Abstract

In this report, a formalization of the object-oriented data model is proposed, which integrates
value-oriented models and object-oriented models by providing a simple semantics of
object-identity.

The  formalism  reveals  that  the  semantics  of  the  object-oriented  model  consists  of  two 
portions. One is expressed by an algebraic construct, which has essentially a value-oriented
semantics. The other is expressed by object-identities, which characterize the essential difference
of  the  object-oriented  model  from  value-oriented  models,  such  as  the  relational  model  and  the 
logical  database  model.  The  value-oriented  portion  represents  the  abstraction  of  the  real  world 
objects,  while  the  object-oriented  portion  represents  the  existence  of  the  real  world  objects. 
These  two  portions  are  integrated  by  a  simple  commutative  diagram  of  modeling  functions. 

The  formalism  includes  the  expression  of  integrity  constraints  in  its  construct  of  classes, 
which  provides  the  natural  integration  of  the  logical  database  model  and  the  object-oriented 
database  model.  More  specifically,  we  will  show  that  a  datalog  program  can  be  expressed  as  a 
collection  of  classes  in  our  model. 

As  an  application  of  the  formalism,  formal  guidelines  on  database  design  are  also  discussed. 
1  Introduction

In recent years, many attempts have been made to formalize the semantics of the object-oriented
model. As the result of these efforts, several models have been proposed [AK 89],
[LR 89], [KW 89], [CW 89]. Roughly speaking, these models are logical database models with
typed variables. Their approach is to incorporate a structured knowledge representation, such
as complex objects and object hierarchies, into a logical representation paradigm. However, the
semantics of object-identity is not captured in these models. Although [AK 89] formalizes
object-identity in its model, the semantics remains complicated. Basically, what they have
done is to "push" object-identity into a value-oriented framework consisting of logic and types.
However, as discussed later, the notion of object-identity is something that will never fit into
the value-oriented paradigm.

In this report, a formal semantics of an object-oriented model is proposed, which approaches
the issue from the opposite direction. We try to incorporate a logical knowledge representation
into a structured knowledge representation paradigm. We will show that our approach provides
a natural formalization of object-identity and a simple integration of the object-oriented
paradigm and the value-oriented paradigm.

This  report  has  two  main  objectives.  One  is  to  provide  simple  and  elegant  semantics  of 
object-identity,  which  integrates  value-oriented  models  and  object-oriented  models.  The  other 
is  to  extend  the  formalization  of  objects  so  that  the  integrity  constraints  are  included. 

1.1  Formalization  of  Object-identity 

In  this  section,  we  first  provide  an  overview  of  the  origin  and  the  role  of  object-identity  in 
knowledge  representation,  using  the  discussions  in  the  literature  listed  above.  Then,  we  provide 
an  outline  of  our  formalization  of  object-identity. 

The  semantics  of  object-identity  is  obtained  by  considering  a  basic  aspect  of  a  knowledge 
representation.  Namely,  any  knowledge  representation  is  only  an  approximation  of  the  real 
world  knowledge.  The  existence  of  objects  in  the  real  world  cannot  be  captured  by  the  values 
of expressions. We consider an example. Let us assume that a concept 'person' is expressed
by name and address according to the following schema in the sense of [AK 89]¹.

Location = [city:String, street:String, number:Integer],

Person = [name:[first:String, last:String], address:Location].

In  most  cases,  we  can  completely  identify  each  individual  person  by  providing  the  name  and 
address.  However,  there  is  a  possibility  that  two  distinct  persons  with  the  same  name  are 
living  at  the  same  place.  The  occurrence  of  these  persons  cannot  be  characterized  by  the 
values of attributes 'name' and 'address'. We can come up with two relevant solutions for
this problem. One is to provide more attributes for expressing the concept 'person'. However,
the real attributes of a person are almost infinite in number. So, even if we introduce many
attributes for 'person', we cannot eliminate the possibility that some distinct persons are
expressed by the same set of attribute values. The other solution is to provide a key attribute
to express the uniqueness of each individual person. However, this does not provide a natural
way of expressing the real world, because it is an artificial attribute. We cannot avoid the
unnecessary semantics of the key attribute. For example, a 'social-security-number' may be

¹ We use the notation explained in [AK 89] for the moment.


implemented as either an integer or a string consisting of digit characters. In order to define
the equality of objects, we have to define it as equality of integers or equality of strings, according
to the "implementation." Further, we have to express the maintenance of the key attribute
explicitly in the higher level semantics. For example, "once an instance is created, the key
attribute should not be altered" and "there should not be more than one instance whose key
attributes are identical." Since the semantics of "real existence of objects" is just that of a set
with the equality relation, it is not desirable that the semantics of the implementation appears
in higher level semantics.

The problem is essentially due to the inherent incompleteness of our representation. Therefore,
rather than expressing the uniqueness of an occurrence in the real world by attribute values,
we need something that specifies the existence of the occurrence. The object-identity serves
this role. It is important that an object-identity is not a value. Instead, it is an entry point
for information access in our knowledge. In other words, it is the reference to the knowledgebase.
Hence, as discussed in [LR 89], it provides the basis for object sharing, which is the most
important advantage of introducing object-identities in a practical system.

Let us come back to the previous example. Suppose that a person named "John Ford"
lives at "2260 Yale Street, Palo Alto". Moreover, suppose that a person named "Mary Carter"
lives with him. These facts are expressed by:

'P001' = [name:[first:"John", last:"Ford"], address:'L010'],

'P002' = [name:[first:"Mary", last:"Carter"], address:'L010'],

'L010' = [city:"Palo Alto", street:"Yale", number:2260].

What happens if the name of the street where John lives is changed from "Yale" to "Harvard"?
Since John lives at the location 'L010', the expression of the location becomes:

'L010' = [city:"Palo Alto", street:"Harvard", number:2260].

Hence, after the change, we can say that both John and Mary are living on Harvard Street.
The point is that 'L010' corresponds to the existing location on earth, and John and Mary's
address is expressed by referring to 'L010'. Thus, when its street name is changed, the
change is propagated properly.
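The propagation described above can be illustrated with a minimal Python sketch. The dictionary `store` and the helper `street_of` are hypothetical names introduced only for this illustration; they are not part of the notation of [AK 89]. Object-identities are modelled as opaque keys, and the address attribute holds a key rather than a copy of the location.

```python
# Hypothetical in-memory store: object-identities are opaque keys, not values.
store = {
    "L010": {"city": "Palo Alto", "street": "Yale", "number": 2260},
    "P001": {"name": {"first": "John", "last": "Ford"}, "address": "L010"},
    "P002": {"name": {"first": "Mary", "last": "Carter"}, "address": "L010"},
}

def street_of(person_id):
    """Follow the address reference and read the street of the shared location."""
    location_id = store[person_id]["address"]
    return store[location_id]["street"]

# The street is renamed once, on the shared location object.
store["L010"]["street"] = "Harvard"

print(street_of("P001"), street_of("P002"))
```

Because both persons refer to 'L010' rather than embedding a copy of it, a single update is visible through both references.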

So  far,  we  have  seen  the  origin  of  object-identity  and  the  role  of  object-identity  in  the 
knowledge  representation.  To  summarize: 

• The object-identity corresponds to the real existence of objects in the real world, which
cannot be captured by the value of an expression.

• The object-identity provides the basis of object-sharing. An object-identity is the reference
to represented knowledge, which is exactly what is to be shared.

Next  we  claim  that  in  order  to  take  full  advantage  of  object-sharing,  attribute  values  of  an 
object  should  be  object-identities. 

[AK 89] and [CW 89] allow complex values² as the values of attributes. This permits
complicated expressions of objects. Namely, in the above example, [first:"John", last:"Ford"]
is a complex (structured) value. However, this approach has a disadvantage. If we allow
complex values, there is an inherent possibility that the subexpression of a complex value

² We use the term "complex value" instead of "complex object". They don't carry object-identity.
would be changed. Since a substructure of a value cannot be shared, it will cause costly
update  maintenance.  Of  course,  the  schema  is  designed  so  that  the  attribute  value  of  'name' 
is  really  a  value  and  not  sharable,  because  it  is  quite  natural  to  express  a  person’s  name  as  a 
value.  However,  even  in  this  case,  we  can  show  an  example  that  demonstrates  the  necessity  of 
sharing  objects. 

Let us consider an additional concept, 'BusinessCard'.

BusinessCard = [company:String, title:String, name:[first:String, last:String]]

Assume that John's business card is expressed by:

'B011' = [company:"CDB", title:"salesman", name:[first:"John", last:"Ford"]]

What happens if John marries Mary and changes his last name to "Carter"? We have to create
a new value:

[first:"John", last:"Carter"],

and replace

[first:"John", last:"Ford"].

The creation of the new value will be costly when the structure is large. Furthermore, we have
to replace 'name' of both 'P001' and 'B011'.

If there is no need for object-sharing, the complex value would be reasonable. However,
if we have more than one concept that shares the same value, as in the above example, we should
incorporate object-sharing. Thus, in this case, the following schema will be preferable.


Name = [first:String, last:String],

Person = [name:Name, address:Location],

BusinessCard = [company:String, title:String, name:Name].

The point is that every attribute should refer to an object with object-identity. Therefore,
it is not desirable to design such a schema as the original 'Person' with the complex value
[first:String, last:String] as an attribute value. The schema must be changed dramatically when we
add a new schema object like 'BusinessCard'.

In  order  to  demonstrate  the  idea  more  clearly,  we  repeat  the  discussion  with  the  following 
schema.  In  this  case,  the  attribute  is  not  a  complex  value,  but  just  a  value. 

Person = [name:String, employer:String],

BusinessCard = [company:String, name:String].

The  information  about  John  will  be  expressed  by: 

'P001' = [name:"John Ford", employer:"CDB"]

'B011' = [company:"CDB", title:"salesman", name:"John Ford"]

If he changes his company from "CDB" to "HAL", we have to change the employer of 'P001'
and the company of 'B011'. Therefore, rather than having value attributes, we should have
only attributes referring to the object-identities of other objects. Namely,

'P001' = [name:'N012', employer:'E000'],

'B011' = [company:'E000', title:'S111', name:'N012'].

'E000' can be associated with a string value "CDB" or "HAL."

To summarize, in order to make full use of object-sharing, it is preferable that a schema
object doesn't have a 'value' as an attribute value³. Instead, an attribute value should be an
object-identity referring to another object instance. In particular, it is not desirable to have complex
values as attribute values.

Moreover, since the attribute names, such as 'name' and 'employer', can be regarded as access
functions, we get the following flat representation.

name('P001') = 'N012', employer('P001') = 'E000',

company('B011') = 'E000', title('B011') = 'S111', name('B011') = 'N012'.

Thus the information about John is expressed by partial functions from object-identities to
object-identities. We call this representation space the object-identity space, which will be precisely
formalized in Section 4.2. The semantics of this representation is quite simple.
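As an informal illustration of this flat representation, the following Python sketch models each attribute name as a partial function from object-identities to object-identities. The names `oid_space`, `apply_attr`, and `value_of` are hypothetical and chosen only for this sketch; the precise formalization is the one given in Section 4.2, not this code.

```python
# Each attribute name is a partial function on object-identities,
# modelled as a dict: an absent key means the function is undefined there.
oid_space = {
    "name":     {"P001": "N012", "B011": "N012"},
    "employer": {"P001": "E000"},
    "company":  {"B011": "E000"},
    "title":    {"B011": "S111"},
}

def apply_attr(attribute, oid):
    """Apply an access function; return None where it is undefined."""
    return oid_space[attribute].get(oid)

# Hypothetical association of some identities with printable values.
value_of = {"E000": "CDB", "N012": "John Ford", "S111": "salesman"}

# The employer changes its value once; both access paths observe the change.
value_of["E000"] = "HAL"
```

Here `employer('P001')` and `company('B011')` return the same identity, so a change attached to that identity is shared without touching either referring object.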

However, the above representation lacks an important feature of object-oriented
representation, namely the explicit structural representation of knowledge. One of the big
advantages of frames or complex objects in knowledge representation is that they provide a
structure of knowledge that we can easily imagine and manage. Of course, we can express the
semantics of complex objects in first order logic by some transformation [CW 89]. However, if
we express it in first order sentences or formulas, the structure is concealed in the semantics of
the sentences. Hence we have to interpret the first order sentences to get the structure. Therefore,
we should integrate the object-identity space with a structured complex-value representation.
In Chapter 4, we give a simple and elegant formalization that integrates them. The outline
of the integration is as follows. First, we provide the syntactical construct of schema objects.
Next, we provide the value-oriented model, i.e. an algebraic model with (complex) values.
Then we provide the model expressed by the object-identity space. Finally, we provide the
mapping that combines the object-identity space and the algebraic representation of complex values.
The compatibility of the object-identity space representation and the algebraic representation is
expressed by a simple commutative diagram.

1.2  Integrity  Constraints 

In the conventional approach, as in [AK 89] and [KW 89], schema objects are defined with the structure
expressed by types. Then logical formulas are constructed on top of the objects (Rules in
[AK 89], O-formulas in [KW 89]).

In our model, each schema object, called a C-class, consists of a type and a restriction predicate.
The type expresses the structure of the knowledge representation, which is referred
to as a complex object or a hierarchy of objects in conventional object-oriented models. The
restriction predicate expresses the integrity constraint of the representation. Let us consider
"absolute temperature" as a simple example. It can be expressed by the positive real numbers.
The structure will be realized by the algebra $\mathbf{R}$ with operations +, -, *, etc. The integrity
constraints will be expressed by the predicate $R(x) = (x > 0)$ expressing "positiveness."

By including the integrity constraints as the basic component of each object, we can show
that every unit of knowledge can be expressed by objects. Even a logical formula can be
³ The term "attribute value" is not a nice terminology. Maybe it should be "accessed attribute."
expressed by an object. In the conventional approach, a logical formula (ground fact) is a value
in the sense that if every substructure of two logical formulas is the same, then those logical
formulas are the same. However, as discussed in Chapter 5, even a logical formula cannot
be treated as a value, due to the inherent incompleteness of our knowledge representation.
Rather, it should be expressed by an object that carries a unique object-identity.

If  we  express  knowledge  by  objects,  we  can  provide  a  representation  of  the  real  world  that  is 
closer to our intuition than expressing knowledge by logical formulas on complex objects. We
will  discuss  this  matter  in  detail  in  Chapter  5. 

1.3  Outline 

The  outline  of  this  report  is  as  follows. 

In Chapter 2, we introduce a notion of data algebra that is an abstraction of data. Roughly
speaking, the data algebra is the combination of type and integrity constraints. The type part
is expressed by a universal algebra, and the integrity constraints part is expressed by a boolean
function. The data algebra provides the basis for the semantics of the value-oriented data model,
which is discussed in Chapter 4.

In Chapter 3, we introduce a notion of C-class that formalizes schema objects. A C-class is
a construct that expresses a unit of real world knowledge. As mentioned earlier, in conventional
models such as [AK 89], [KW 89], those units of knowledge are expressed by complex objects
and logical formulas. The C-class is similar to a class in the usual object-oriented languages,
such as Smalltalk and CLOS [WT 89]. A C-class is a combination of syntactical expressions
of a type and a restriction predicate; the type specifies the structure and the restriction predicate
expresses integrity constraints. A restriction predicate is a first order formula with implicitly
typed variables, which is essentially a restricted form of O-formula [KW 89]. We also introduce
a hierarchy among C-classes to express the hierarchy of knowledge.

In Chapter 4, we discuss the main theme of this report, object-identity. First we formalize
a value-oriented model of C-classes. Then we define an object-oriented model of C-classes by
introducing the object-identity space. This object-oriented model represents the clear semantic
distinction between a value-oriented model and an object-oriented model. Further, it clarifies the role
of object-identity in the knowledge representation.

In  Chapter  5,  we  consider  the  C-classes  in  detail  and  provide  some  kinds  of  C-classes.  It 
reveals  that  even  a  logical  representation  of  knowledge  cannot  be  captured  in  a  value-oriented 
paradigm.  We  discuss  which  knowledge  should  be  value  and  which  should  be  object  as  the 
database  design  issue.  We  introduce  the  concept  model  as  a  knowledgebase  model. 

In Chapter 6, we demonstrate the expressibility of the concept model by simulating the
semantics of other models, such as datalog and IQL [AK 89].

In  Appendices,  we  briefly  discuss  database  operations,  inheritance  and  overloading.  The 
semantics  of  database  operation  is  quite  simple,  especially  for  queries.  Furthermore,  we  provide 
the  copy  of  the  actual  session  performed  on  the  prototype  system  that  has  been  implemented. 


2  Data  Algebras 

We introduce a notion of data algebra to express instances of schema objects. The notion of
data algebra is an abstract formalization of complex objects with integrity constraints, which
will serve as a value-oriented model of schema objects later. We assume a basic knowledge of
universal algebra, as found in pp. 22-60 of [BS 81].

2.1  Multi-valued  Universal  Algebra 

In order to define the notion of data algebra, we provide a precise definition of multi-valued
function, partial function, and an extended universal algebra. Readers who prefer to skip the
mathematical details may read only the last paragraph of this section.

Let $A$ and $B$ be sets, and let $2^A$ and $2^B$ be the power sets of $A$ and $B$ respectively. Then,
a function $f$ from $2^A$ to $2^B$ is called a multi-valued function⁴ from $A$ to $B$, if it satisfies the
following condition.

\[ \forall U \in 2^A, \quad f(U) = \bigcup_{x \in U} f(\{x\}). \]

We denote the multi-valued function as:

\[ f :: A \to B. \]

It is easily proven that the composition of multi-valued functions is also a multi-valued
function. Namely,

\[ f :: A \to B,\ g :: B \to C \ \Longrightarrow\ g \circ f :: A \to C. \]

A multi-valued function $f$ is called total if

\[ f :: A \to B, \quad \forall x \in A,\ f(\{x\}) \neq \emptyset. \]

Note that we can construct a category consisting of sets as objects and multi-valued functions
as morphisms. The identity multi-valued function $\mathrm{id}_A$ on a set $A$ is the identity function on
$2^A$.

For a multi-valued function $f$ from $A$ to $B$, we can always define the quasi-inverse function
$f^{-1}$ from $B$ to $A$.

\[ \forall V \in 2^B, \quad f^{-1}(V) \overset{\mathrm{def}}{=} \{ x \in A \mid f(\{x\}) \cap V \neq \emptyset \}. \]

A multi-valued function $f$ from $A$ to $B$ is called injective if $f^{-1} \circ f$ equals $\mathrm{id}_A$. The function $f$
is called surjective if $f \circ f^{-1}$ equals $\mathrm{id}_B$.

A partial function $f$ from $A$ to $B$ is a multi-valued function from $A$ to $B$ such that for each
element of $A$, the cardinality of its image is no more than one,

\[ \forall x \in A, \quad \mathrm{card}(f(\{x\})) \le 1. \]

The domain $d(f)$ of a partial function $f$ is:

\[ d(f) = \{ x \in A \mid f(\{x\}) \neq \emptyset \}. \]
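The definitions above can be tried out on finite sets with the following Python sketch. It assumes, for illustration only, that a multi-valued function is given by a finite set of (x, y) pairs; the helper names `lift`, `compose`, `quasi_inverse`, `is_partial`, and `domain` are hypothetical and do not appear in the report.

```python
def lift(relation):
    """Build a multi-valued function from a set of (x, y) pairs.
    By construction f(U) is the union of f({x}) over x in U."""
    def f(U):
        return frozenset(y for (x, y) in relation if x in U)
    return f

def compose(g, f):
    """Composition of multi-valued functions: (g o f)(U) = g(f(U))."""
    return lambda U: g(f(U))

def quasi_inverse(f, A):
    """Quasi-inverse on subsets V of B: all x in A whose image meets V."""
    return lambda V: frozenset(x for x in A if f(frozenset({x})) & frozenset(V))

def is_partial(f, A):
    """True if every singleton has an image of cardinality at most one."""
    return all(len(f(frozenset({x}))) <= 1 for x in A)

def domain(f, A):
    """Elements of A whose image is non-empty."""
    return frozenset(x for x in A if f(frozenset({x})))

A = {1, 2, 3}
f = lift({(1, "a"), (1, "b"), (2, "b")})  # 3 has no image, so f is not total
g = lift({(1, "a"), (2, "b")})            # at most one image each: partial
h = compose(lift({("a", "X"), ("b", "Y")}), f)
```

In this sketch `f` is multi-valued but not partial, while `g` is a partial function whose domain is {1, 2}.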

Any function $h$ from $A$ to $B$ can be regarded as a multi-valued function. Namely, we can
define a multi-valued function $h$ by⁵:

\[ \forall U \in 2^A, \quad h(U) = \{ h(x) \mid x \in U \}. \]

⁴ A multi-valued function from $A$ to $B$ is equivalent to a binary relation on $A \times B$.

⁵ The operator can be considered as a functor from the category of sets to another category consisting of sets as
objects and multi-valued functions as morphisms.
In the rest of this report, we use the following simplified notation so long as it causes no
confusion. For a multi-valued function $f$ from $A$ to $B$, for an element $x$ of $A$, and $y$ of $B$,

\[ f(x) \overset{\mathrm{def}}{=} f(\{x\}), \qquad (f(x) = y) \overset{\mathrm{def}}{\iff} (f(\{x\}) = \{y\}). \]

Moreover, we introduce a virtual element $v_\perp$ to express "undefinedness", which is called the
null value. The null value $v_\perp$ is a common element of all sets. For a multi-valued (partial)
function $f$, we denote

\[ (f(x) = v_\perp) \overset{\mathrm{def}}{\iff} (f(\{x\}) = \emptyset). \]

Now we extend the notion of universal algebra. A multi-valued universal algebra $\mathbf{A}$ is
a pair of a set $A$ and a family $\{f_i\}_{i \in I}$ of multi-valued functions. All the notions, such as
homomorphism and isomorphism, are redefined using multi-valued functions instead of functions.
Similarly, a partial-valued universal algebra is a multi-valued universal algebra such that all
the functions are partial.

The notion of data algebra is defined by multi-valued universal algebras. However, in order
to make the discussion simple, we only consider partial-valued universal algebras in the rest
of this report. The reader can consider the partial-valued universal algebra as a usual universal
algebra, except for the existence of the null value. Hence, we use the term "universal algebra"
instead of "partial-valued universal algebra" from now on. But readers should remember that
functions are partial.


2.2  Definition  of  Data  Algebra 

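In this section, we provide the definition of data algebras. A data algebra $\delta$ is a pair of a
universal algebra⁶ $\mathbf{A}$ and a restriction function⁷ $r$. Namely,

\[ \delta = (\mathbf{A}, r), \qquad \mathbf{A} = (A, \{f_i\}_{i \in I}), \qquad r : \mathbf{A} \to \mathbf{2}, \]

where $\mathbf{2}$ is the two-element boolean algebra,

\[ \mathbf{2} = (\{0, 1\}, \wedge, \vee, \neg). \]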


Further, we assume that each data algebra contains a special element, the null value $v_\perp$. As
mentioned before, the null value expresses "undefinedness." For each function, if one of the
arguments is the null value then its value is also the null value.

A data algebra is the abstraction of a collection of data with operations on it. For example,
"positive numbers" would be expressed by a data algebra:

\[ (\mathbf{R}, r), \qquad r(x) \overset{\mathrm{def}}{=} \begin{cases} 1 & (x > 0) \\ 0 & (x \le 0), \end{cases} \]

where $\mathbf{R}$ is the universal algebra of real numbers.
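A minimal Python sketch of this example follows. The class `DataAlgebra`, the wrapper `strict`, and the constant `NULL` are hypothetical names introduced for illustration; Python floats stand in for the real numbers, and only two operations are included. The wrapper implements the rule that an operation applied to the null value yields the null value.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

NULL = None  # stand-in for the null value ("undefinedness")

def strict(op: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap an operation so that any NULL argument yields NULL."""
    def wrapped(*args: Any) -> Any:
        if any(a is NULL for a in args):
            return NULL
        return op(*args)
    return wrapped

@dataclass(frozen=True)
class DataAlgebra:
    operations: Dict[str, Callable[..., Any]]  # the universal-algebra part
    restriction: Callable[[Any], int]          # maps an element to 0 or 1

# "Positive numbers": arithmetic on numbers plus the restriction x > 0.
positive_numbers = DataAlgebra(
    operations={
        "+": strict(lambda a, b: a + b),
        "*": strict(lambda a, b: a * b),
    },
    restriction=lambda x: 1 if x > 0 else 0,
)
```

The algebra part and the restriction part are kept separate, so the same operations can be reused with a different restriction function.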

⁶ We should remember that this universal algebra is a partial-valued universal algebra defined in the previous
section. We can regard it as if it were a usual universal algebra by introducing a null value as a common value of
all universal algebras.

⁷ More precisely, $r$ is a function on the domain of $\mathbf{A}$. However, we describe it as a function on $\mathbf{A}$. Similarly,
throughout this report, we treat $\mathbf{A}$ and its domain interchangeably so long as the meaning is clear. For example,
given universal algebras $\mathbf{A}$, $\mathbf{B}$, we would state something like "a mapping from $\mathbf{A}$ to $\mathbf{B}$." The meaning is "a mapping
from the domain of $\mathbf{A}$ to the domain of $\mathbf{B}$."




2.3  Fundamental  Operators 

We introduce operations among data algebras. These operations will provide the interpretations
of fundamental operations on C-classes, which will be introduced in Chapter 3. Each
operation happens to have a corresponding construct in relational algebra or SQL. However,
we should note that these operators have not been obtained by a mere extension of relational
algebra, but by the consideration of knowledge representation, as their names suggest. We
could say that one reason for the success of the relational model is that
the relational operations correspond to a higher level of mental processes, such as
abstraction of concepts. This will be clear when we introduce the fundamental operators on
C-classes in Chapter 3.

2.3.1  Aggregation 

The aggregation operator constructs a complex structure out of data algebras. It is similar to
the cartesian product operator in relational algebra.

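Let $\Phi$ be a set of symbols, and let $\alpha$ be a mapping from $\Phi$ to a set of data algebras,

\[ \alpha(f) = \delta_f = (\mathbf{A}_f, r_f) \quad (f \in \Phi). \]

Then the aggregation of $\{\delta_f\}_{f \in \Phi}$ is

\[ \Bigl( \prod_{f \in \Phi} \mathbf{A}_f,\ \bigwedge_{f \in \Phi} (r_f \circ p_f) \Bigr), \]

where $p_f$ is the projection from $\prod_{f \in \Phi} \mathbf{A}_f$ to $\mathbf{A}_f$, and $\circ$ designates a composition of mappings.
We denote the aggregation by

\[ \prod_{f \in \Phi} \delta_f. \]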

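In particular, if $\Phi$ is equal to $\{1, \ldots, n\}$, we denote it

\[ \prod_{i=1}^{n} \delta_i. \]

Moreover, if $\delta_i$ is equal to a data algebra $\delta$ for each $i$ $(1 \le i \le n)$, we denote $\delta^n$ instead of
$\prod_{i=1}^{n} \delta_i$. Furthermore, if we write the aggregated data algebras as:

\[ \delta'' \times \delta, \qquad \delta'' \times \delta \times \delta', \]

for given data algebras $\delta$, $\delta'$, $\delta''$, etc., it means that we are assuming the following implicit
sequencing,

\[ \alpha(1) = \delta'', \quad \alpha(2) = \delta, \quad \alpha(3) = \delta'. \]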
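The restriction part of an aggregation can be sketched in Python as follows. This is only an illustration under simplifying assumptions: elements of the product are modelled as dictionaries keyed by the symbols, the projection for a symbol is a dictionary lookup, and the component restrictions shown are hypothetical examples, not taken from the report.

```python
def aggregate(restrictions):
    """Given a dict mapping each symbol f to a restriction function r_f,
    return the restriction of the aggregation: the conjunction over all f
    of r_f applied to the f-th component of a record."""
    def r(record):
        return int(all(r_f(record[f]) == 1 for f, r_f in restrictions.items()))
    return r

# Hypothetical component restrictions for a 'Location'-like record.
location_r = aggregate({
    "city": lambda c: int(len(c) > 0),
    "street": lambda s: int(len(s) > 0),
    "number": lambda n: int(n > 0),
})
```

A record satisfies the aggregated restriction exactly when every component satisfies its own restriction, which mirrors the conjunction of the composed functions in the definition.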


2.3.2  Recursive  Aggregation 

In order to provide an algebraic model for recursive types, we introduce recursive aggregation.
Let $G$ be a directed graph with nodes $V$ and labeled edges $E$,

\[ G = (V, E). \]

We denote an element of $E$ by $(n, m, l)$, which means that there is an edge from $n$ to $m$ labeled
by $l$. Further, let $\alpha$ be a mapping from $U$ to a set of data algebras, where $U$ is the subset of
$V$ such that each element $u$ of $U$ doesn't have any edge that comes into $u$,

\[ U = \{ u \in V \mid \neg(\exists v\, \exists l\ (v, u, l) \in E) \}, \]

\[ \alpha(n) = (\mathbf{B}_n, s_n) \quad (n \in U). \]

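Then, the recursive aggregation with respect to $G$ and $\alpha$ is defined as follows.

\[ \Pi(G, \alpha) = \Bigl( \prod_{n \in V} \mathbf{A}_n,\ \bigwedge_{n \in V} (r_n \circ \pi_n) \Bigr) \]

\[ \forall n \in V, \quad \mathbf{A}_n \overset{\mathrm{def}}{=} \begin{cases} \mathbf{B}_n & (n \in U) \\ \prod_{(n,m,l) \in E} \mathbf{A}_{(m,l)} & (n \notin U) \end{cases} \]

\[ \forall (n, m, l) \in E, \quad \mathbf{A}_{(m,l)} = \mathbf{A}_m \]

\[ \forall n \in V, \quad r_n(x) = \begin{cases} 1 & (1 \in t_n(x)) \\ 0 & (\text{otherwise}) \end{cases} \]

The functions $t_n$ are multi-valued functions from $\mathbf{A}_n$ to $\mathbf{2}$, which are defined as follows.

\[ t_n = \begin{cases} \bigwedge_{(n,m,l) \in E} (s_m \circ \pi_{(m,l)}) & (n \notin U) \\ s_n & (\text{otherwise}) \end{cases} \]

\[ \text{where} \quad s_m(x) = \begin{cases} 1 & (x = v_\perp) \\ t_m(x) & (\text{otherwise}) \end{cases} \]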
Note that the elements of $\mathbf{A}_n$ $(n \notin U)$ have an infinite structure in general. We may regard
those elements as infinite trees. However, since we allow the null value as the common element
of every algebra, we can express elements with a finitely recursive structure. The function $t_n$
is well-defined if the recursive definition assigns consistent values to each subtree. Although
the restriction function $r$ is a partial function, it is well-defined on the elements with finite
structure and cyclic structure. The aggregation defined above is a special case of the recursive
aggregation. In fact, if we assume:

\[ V = \Phi, \quad E = \emptyset, \quad \alpha(f) = \delta_f \ (f \in \Phi), \]

we get the original aggregation operator.

2.3.3  Abstraction 

The  abstraction  operator  constructs  a  new  data  algebra  ignoring  some  of  the  substructures  of 
a  data  algebra.  It  is  similar  to  the  projection  operator  in  relational  algebra. 

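For $f \in \Phi$, let $\mathbf{A}_f$ be a universal algebra, where $\Phi$ is a set of symbols, and let $\Psi$ be a subset
of $\Phi$. Let us consider the following data algebra $\delta$,

\[ \delta = \Bigl( \prod_{f \in \Phi} \mathbf{A}_f,\ r \Bigr), \]

where $\prod_{f \in \Phi} \mathbf{A}_f$ is the product algebra of $\{\mathbf{A}_f\}_{f \in \Phi}$. Further, let $P_\Psi$ be the projection from
$\prod_{f \in \Phi} \mathbf{A}_f$ to $\prod_{g \in \Psi} \mathbf{A}_g$. The abstraction $T(\delta, \Psi)$ of the data algebra $\delta$ with respect to $\Psi$ is
defined as:

\[ T(\delta, \Psi) = \Bigl( \prod_{g \in \Psi} \mathbf{A}_g,\ r' \Bigr), \]

where

\[ r'(x) = \begin{cases} 1 & (\text{if } \exists y \in P_\Psi^{-1}(x),\ r(y) = 1) \\ 0 & (\text{otherwise}) \end{cases} \]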

2.3.4  Restriction 

The  restriction  operator  imposes  a  new  restriction  on  a  data  algebra.  It  is  similar  to  the 
selection  operator  in  relational  algebra. 

Let $\delta = (\mathbf{A}, r)$ be a data algebra, and let $s$ be a mapping from the domain of $\mathbf{A}$ to $\mathbf{2}$.
Then the restriction with respect to $s$ is

\[ (\mathbf{A},\ r \wedge s). \]

The restriction is denoted by $\Theta(\delta, s)$.
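The restriction operator only changes the restriction function, which a short Python sketch makes visible. The helper `restrict` and the sample functions are hypothetical names for illustration; restriction functions are modelled as functions returning 0 or 1, so the conjunction is a bitwise AND.

```python
def restrict(r, s):
    """Return the restriction function of the restricted data algebra:
    the pointwise conjunction of r and s."""
    return lambda x: r(x) & s(x)

# Hypothetical example: start from numbers with a trivial restriction
# and impose "positiveness" as the new restriction.
any_number = lambda x: 1
positive = restrict(any_number, lambda x: int(x > 0))
```

The underlying algebra is untouched, which is why the operator corresponds to selection rather than to a change of structure.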


2.3.5  Sequence  Construction 

The sequence construction operator constructs a data algebra consisting of sequences of elements
of a data algebra.

Let $\delta = (\mathbf{A}, r)$ be a data algebra. The sequence algebra $\mathrm{Seq}(\delta)$ derived from $\delta$ consists
of the direct sum⁸ of the product algebras $\mathbf{A}^n$ $(n = 0, 1, 2, \ldots)$, and the relevant restriction
function $r_{seq}$,

\[ \mathrm{Seq}(\delta) = \Bigl( \sum_{n=0}^{\infty} \mathbf{A}^n,\ r_{seq} \Bigr) \]

\[ \forall x = (x_1, x_2, \ldots, x_n) \in \mathrm{Seq}(\delta),\ (n = 0, 1, 2, \ldots) \]

Given a class of universal algebras, the set of finite sequences of elements of the algebras
in the class forms a universal algebra with functions length, concatenate, null, reverse, etc.
We designate it by SEQ. We assume that the direct sum $\sum_{n=0}^{\infty} \mathbf{A}^n$ is a subalgebra of SEQ by
embedding it in SEQ.

2.3.6  Bag  Construction 

The  bag  construction  operator  constructs  a  data  algebra  consisting  of  bags  of  elements  of  a 
data  algebra. 

Let $\delta = (A, r)$ be a data algebra. We can define a congruence relation $\sim$ in the direct
sum algebra $\sum_{n=0}^{\infty} A^n$ as follows. For elements $x, y$ of $\mathrm{Seq}(\delta)$,

\[ x = (x_1, \ldots, x_n), \quad y = (y_1, \ldots, y_m), \]

$^8$ The direct sum is always a partial-valued algebra.


the sequences $x$ and $y$ are equivalent with respect to $\sim$, written $x \sim y$, if $n$
equals $m$ and there exists a permutation $\sigma$ of order $n$ such that

\[ (y_1, \ldots, y_n) = (x_{\sigma(1)}, \ldots, x_{\sigma(n)}). \]

Then the bag algebra derived from $\delta$ consists of the quotient algebra of
$\sum_{n=0}^{\infty} A^n$ with respect to $\sim$ and the restriction function $r_{bag}$. Since the
restriction function $r_{seq}$ of $\mathrm{Seq}(\delta)$ has the same value on every element of an
equivalence class of $\sim$, we can define the restriction function of $\mathrm{Bag}(\delta)$ by:

\[ r_{bag}([x]) = r_{seq}(x), \]

where $[x]$ is the equivalence class with respect to $\sim$ containing $x$. Similarly, we can
construct a universal algebra BAG as a quotient algebra of SEQ. As in the definition of
$\mathrm{Seq}(\delta)$, we assume that the algebraic part of $\mathrm{Bag}(\delta)$ is a subalgebra
of BAG by embedding it in BAG.

2.3.7  Set  Construction 

The  set  construction  operator  constructs  the  data  algebra  consisting  of  finite  sets  of  elements 
of  a  data  algebra. 

Let $\delta$ be $(A, r)$. The set algebra $\mathrm{Set}(\delta)$ is the collection of finite sets
of elements of $A$ that satisfy $r$. The definition is as follows. First, we define a restriction
function $s$ on $\mathrm{Bag}(\delta)$. We denote an element of $\mathrm{Bag}(\delta)$ by $[x]$,
where $x$ is an element in $\mathrm{Seq}(\delta)$. Then, for $x = (x_1, x_2, \ldots, x_n)$,

\[ s([x]) \stackrel{\mathrm{def}}{=} \begin{cases} 1 & (i \neq j \Rightarrow x_i \neq x_j) \\ 0 & \text{(otherwise)} \end{cases} \]

Next, let SET be the universal algebra of finite sets with functions $\cup$ (union),
$\cap$ (intersection), $-$ (difference), etc. Then the set algebra $\mathrm{Set}(\delta)$ is
obtained from $\Theta(\mathrm{Bag}(\delta), s)$ by regarding its algebraic component as a
subalgebra of SET. Namely,

\[ \Theta(\mathrm{Bag}(\delta), s) = \Bigl( \sum_{n=0}^{\infty} A^n / \sim,\; r_{bag} \wedge s \Bigr). \]
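
The passage from sequences to bags and then to sets can be sketched in a few lines. This is an illustration only, assuming that elements are hashable Python values and that sequences are tuples; the helper names `bag_equivalent`, `bag_class`, `is_set_like` and `constant_on_class` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def bag_equivalent(x, y):
    # x ~ y: equal length and y is a rearrangement of x by some permutation.
    return len(x) == len(y) and Counter(x) == Counter(y)


def bag_class(x):
    # A canonical representative of the equivalence class [x]:
    # the multiset of entries, stored as (element, multiplicity) pairs.
    return frozenset(Counter(x).items())


def is_set_like(x):
    # The restriction function s on bags: 1 iff all entries are pairwise distinct.
    return len(set(x)) == len(x)


def constant_on_class(r_seq, x):
    # Checks that a candidate r_seq takes one value on all of [x],
    # which is what makes r_bag([x]) = r_seq(x) well defined.
    return len({bool(r_seq(p)) for p in permutations(x)}) == 1
```

A restriction function that depends on the position of an element, such as "the first entry equals 1", fails `constant_on_class` and therefore cannot be carried over to bags; this is the role of the congruence condition in the definition.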


2.3.8  Categorization 

The categorization operator constructs a new data algebra by categorizing elements of a data
algebra with respect to the values of some substructures. It is similar to the grouping construct
of SQL without aggregation functions.

Let $\Phi$ be a set of symbols, let $\Psi$ be a subset of $\Phi$, and let $\Psi^c$ be the
complement of $\Psi$,

\[ \Psi \subset \Phi, \quad \Psi^c = \Phi - \Psi. \]

Further let $\beta$ be a mapping from $\Phi$ to a set of universal algebras. Now let us consider
the following data algebra $\delta$:

\[ \delta = \Bigl( \prod_{f \in \Phi} \beta(f),\; r \Bigr). \]

Then the categorization $\Omega(\delta, \Psi)$ of $\delta$ with respect to $\Psi$ is:

\[ \Omega(\delta, \Psi) = \Theta\bigl( \Upsilon(\delta, \Psi) \times \mathrm{Set}(\Upsilon(\delta, \Psi^c)),\; r_\Omega \bigr). \]

The restriction function $r_\Omega$ is defined as follows.

\[ r_\Omega((x, y)) = \begin{cases} 1 & (\forall z \in y,\ r(x \oplus z) = 1) \\ 0 & \text{(otherwise),} \end{cases} \]

where for $(x_1, \ldots, x_n)$ in $\prod_{f \in \Psi} \beta(f)$ and $(y_1, \ldots, y_m)$ in
$\prod_{f \in \Psi^c} \beta(f)$,

\[ (x_1, \ldots, x_n) \oplus (y_1, \ldots, y_m) = (x_1, \ldots, x_n, y_1, \ldots, y_m) \in \prod_{f \in \Phi} \beta(f). \]
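
The analogy with SQL grouping can be made concrete with a short sketch. It assumes a finite list of tuples, and it assumes that the grouping attributes come first in each tuple so that combining a key with a remainder is plain tuple concatenation; `categorize`, `r_omega`, `dept` and `name` are hypothetical names. The function `categorize` returns, for each key, the largest remainder set that satisfies the restriction function of the categorization.

```python
from collections import defaultdict


def categorize(rows, attrs, psi):
    # Group tuples by the attributes in psi; each group collects the
    # remaining components of the tuples that share the same key.
    key_idx = [attrs.index(a) for a in psi]
    rest_idx = [i for i in range(len(attrs)) if i not in key_idx]
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        groups[key].add(tuple(row[i] for i in rest_idx))
    return {k: frozenset(v) for k, v in groups.items()}


def r_omega(r, x, y):
    # 1 iff every remainder z in y combines with the key x into a tuple
    # allowed by r (assumes the key attributes are the leading components).
    return all(r(x + z) for z in y)


attrs = ("dept", "name")  # hypothetical attribute symbols
rows = [("sales", "ann"), ("sales", "bob"), ("dev", "cy")]
cat = categorize(rows, attrs, ("dept",))
```

Note that any subset of a group also satisfies `r_omega`, so the categorization as a data algebra contains more pairs than the maximal groups computed here.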


2.4  Many-Sorted  Data  Algebra 

So far, we have introduced the notion of data algebra based on universal algebras. In this
section, we extend the notion to many-sorted universal algebras.

First we consider a many-sorted algebra with sorts $S$. For a sort $s$ in $S$, let us denote the
universal algebra of the sort $s$ by $A_s$, and let $\mathcal{A}(s)$ be the collection of all
subalgebras of $A_s$. Further let $\mathcal{A}(S)$ be the closure of
$\bigcup_{s \in S} \mathcal{A}(s)$ with respect to the cartesian product operator.
A set of data algebras $D$ is the many-sorted data algebra with sorts $S$ if

\[ \forall \delta \in D, \quad \delta = (A, r), \quad A \in \mathcal{A}(S). \]

We call a data algebra of the following form a primitive data algebra of sort $s$.

\[ \delta = (A_s, r) \quad (s \in S). \]

We assume that a primitive data algebra of sort $s$ is never derived from primitive
algebras of different sorts with fundamental operators. Namely, a data algebra of the form
$(A_s, r)$ $(s \in S)$ is never derived from other data algebras of the form $(A_{s'}, r')$
$(s' \in S - \{s\})$ with fundamental operators.

From now on, we assume that data algebras are constructed on a many-sorted algebra,
even if the set of sorts $S$ is not specifically stated. In other words, data algebras are
generated from primitive data algebras in the sense defined in the next section.


2.5  Generated  Data  Algebra 

For a given set of data algebras, we can generate data algebras by the fundamental operators,
such as aggregation, restriction, abstraction, etc. We call a set of data algebras an algebraic
family of data algebras if it is closed under these operations. For a set $D$ of data algebras, we
can consider the minimum algebraic family of data algebras that contains $D$. We call it the
algebraic closure of $D$ and denote it as $\overline{D}$. Since the intersection of algebraic
families is also an algebraic family, there exists a unique algebraic closure for any set of data
algebras. In fact, the closure of $D$ is the intersection of all algebraic families that contain $D$.

Conversely, we can consider the minimal set of data algebras that generates a given set $D$
of data algebras. More precisely, we can consider the set $\kappa(D)$ of data algebras such that:

• the algebraic closure of $\kappa(D)$ contains $D$,

\[ \overline{\kappa(D)} \supset D. \]

• among the sets that satisfy the above condition, $\kappa(D)$ is minimal. Namely, for a set of
data algebras $E$, if

\[ \overline{E} \supset D \text{ and } \kappa(D) \supset E, \]

then

\[ \kappa(D) = E. \]

It is not difficult to prove the uniqueness of $\kappa(D)$ up to isomorphism, if $D$ is finite.
Namely, it is not only minimal but also minimum. So we call it the kernel of $D$. The kernel of a
set of data algebras provides the building blocks from which the data algebras are constructed.

2.6  Named  Data  Algebra 

The notion of data algebra provides a structure for the space in which we express our knowledge.
However, the structure itself is not enough. For example, we can express "absolute temperature"
and "half line" by the same data algebra as "positive numbers", which is defined in
section 2.2 as an example. Moreover, we don't want to allow operations such as:

1°K + 2 cm.

Thus we need to distinguish the data algebras that are expressions of "absolute temperature"
and "half line." Hence, we attach names to all algebras to distinguish them. We don't allow
algebraic operations between the elements of data algebras with different names. A named
algebra is expressed by a tuple:

\[ (n_\delta, A_\delta, r_\delta). \]

In the rest of this report, we assume that every data algebra is named. However, when we
don't have to consider the name explicitly, we use the previous notation without a name.
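
The effect of naming can be sketched as follows. This is a minimal illustration, assuming that an element of a named data algebra is represented as a name paired with a number; the class `Named` and the names `kelvin` and `cm` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Named:
    """An element of a named data algebra: the name n plus a value of A."""

    name: str
    value: float

    def __add__(self, other):
        # Algebraic operations are allowed only between elements that carry the same name.
        if not isinstance(other, Named) or other.name != self.name:
            raise TypeError(f"cannot add elements of '{self.name}' and another data algebra")
        return Named(self.name, self.value + other.value)
```

With this representation, adding two temperatures succeeds, while an expression such as `Named("kelvin", 1.0) + Named("cm", 2.0)` raises a `TypeError`, even though both values live in the same underlying algebra of positive numbers.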

2.7  Hierarchy  of  Data  Algebras 

In the later chapters, we will see that data algebras play the role of models of a knowledge
representation (schema representation). In order to express the hierarchy of knowledge, we
introduce mappings among data algebras. First we assume that there exists a partial order
$\preceq$ among names of data algebras. If $n \preceq n'$, we say that the name $n$ is a subname
of $n'$. A subtype mapping is a mapping from a data algebra to another data algebra, which is
defined as follows.

Let us consider the many-sorted data algebra on the sorts $S$. Let $\mathcal{A}$ be the set of
universal algebras corresponding to the sorts.

\[ \mathcal{A} = \{ A_s \mid s \in S \}. \]

Let $D$ be the many-sorted data algebra on $S$.

\[ D = \{ \delta = (n_\delta, A_\delta, r_\delta) \}. \]

A subtype mapping $\rho$ from a data algebra $\delta$ to another data algebra $\delta'$ is a
mapping that satisfies the following conditions. Let us assume that:

\[ \delta = (n, A, r), \quad \delta' = (n', A', r'). \]


•  Case 1: The data algebras $\delta$ and $\delta'$ are primitive algebras.

-  The name $n$ is a subname of $n'$ and the algebras are the same.

\[ n \preceq n' \text{ and } A = A', \]

-  The restriction function $r$ is stricter than $r'$ and $\rho$ is the inclusion mapping.
Namely,

\[ \forall x \in A, \quad r(x) = 1 \Rightarrow r'(x) = 1, \]

\[ \partial(\rho) = \{ x \in A \mid r(x) = 1 \}, \]

\[ \forall x \in \partial(\rho), \quad \rho(x) = x. \]

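•  Case 2: The data algebras $\delta$ and $\delta'$ are compound algebras:

\[ \delta = \Bigl( n, \prod_{f \in \Phi} A_f, r \Bigr), \quad \delta' = \Bigl( n', \prod_{f \in \Phi'} A'_f, r' \Bigr). \]

-  The name $n$ is a subname of $n'$; $n \preceq n'$.

-  The attribute set $\Phi'$ is a subset of $\Phi$; $\Phi' \subset \Phi$.

-  For each $f$ in $\Phi'$, there exists a subtype mapping $\rho_f$ from $(A_f, \pi_f(r))$ to
$(A'_f, \pi_f(r'))$, where

\[ \pi_f(r)(x) = \begin{cases} 1 & (\text{if } \bigwedge_{y \in \pi_f^{-1}(x)} (r(y) = 1)) \\ 0 & \text{(otherwise),} \end{cases} \]

and $\pi_f$ is the projection from $\prod_{g \in \Phi} A_g$ to $A_f$.

Similarly for $\pi_f(r')$.

-  Let $\Pi$ be the projection from $\prod_{f \in \Phi} A_f$ to $\prod_{f \in \Phi'} A_f$. Then,

\[ \forall x \in \prod_{f \in \Phi} A_f, \quad r(x) = 1 \Rightarrow r'\bigl( (\pi_{f \in \Phi'} \rho_f) \circ \Pi (x) \bigr) = 1, \]

where $\pi_{f \in \Phi'} \rho_f$ is the product mapping:

\[ \forall x \in \prod_{f \in \Phi'} A_f,\ \forall g \in \Phi', \quad \pi_g\bigl( (\pi_{f \in \Phi'} \rho_f)(x) \bigr) = \rho_g(\pi_g(x)). \]

-  The subtype mapping $\rho$ from $\delta$ to $\delta'$ is defined by:

\[ \rho = (\pi_{f \in \Phi'} \rho_f) \circ \Pi. \]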

We say that $\delta$ is a subtype algebra of $\delta'$ if there exists a subtype mapping from
$\delta$ to $\delta'$.

If there exists a subtype mapping from $\delta$ to $\delta'$, $\delta'$ will be a model of a more
general concept than the concept that has the model $\delta$. We will discuss it precisely in
Chapter 4.
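
Case 1 of the definition above (primitive algebras) can be checked mechanically when the carrier is finite. The sketch below makes that assumption; a data algebra is written as a triple of name, carrier and restriction function, and `is_primitive_subtype`, `subtype_mapping`, `positive` and `nonneg` are hypothetical names introduced for illustration.

```python
def is_primitive_subtype(delta, delta_p, is_subname):
    # Case 1: n is a subname of n', the algebras coincide, and r is stricter than r'.
    (n, carrier, r), (n_p, carrier_p, r_p) = delta, delta_p
    return (
        is_subname(n, n_p)
        and carrier == carrier_p
        and all((not r(x)) or r_p(x) for x in carrier)
    )


def subtype_mapping(delta):
    # The inclusion mapping: defined exactly on {x | r(x) = 1}, with rho(x) = x.
    _, carrier, r = delta
    return {x: x for x in carrier if r(x)}


carrier = tuple(range(-3, 4))
positive = ("positive", carrier, lambda x: x > 0)
nonneg = ("nonneg", carrier, lambda x: x >= 0)
name_order = {("positive", "nonneg")}  # hypothetical subname relation


def is_subname(a, b):
    return a == b or (a, b) in name_order
```

The check fails in the opposite direction for two independent reasons: the name order does not hold, and the restriction of `nonneg` admits 0, which `positive` rejects.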


3  C-Classes 

In order to formalize the construct of schema objects, we introduce the notion of C-class$^9$.
First we define the structure of C-classes.

$^9$ C-class is a kind of class. The letter C in "C-class" is intended to suggest concept.

3.1  C-Class  Construct 

The set $\Gamma$ of C-classes is defined as follows.

3.1.1  Definition of C-Classes

\[ \Gamma = \{ \gamma \mid \gamma = (n_\gamma, \Phi_\gamma, v_\gamma, T_\gamma, A_\gamma, R_\gamma) \}. \]

The intended meaning of symbols is:

•  The name $n_\gamma$ of $\gamma$ is a symbol that designates the name of the C-class
$\gamma$. The symbol is unique to each C-class.

•  The attribute set $\Phi_\gamma$ of $\gamma$ is a set of function symbols that designate
attribute names.

•  The attribute value $v_\gamma$ of $\gamma$ is a mapping from $\Phi_\gamma$ to the set of
C-class names in $\Gamma$.

•  The structural sentences $T_\gamma$ of $\gamma$ are a set of sentences that define the
algebraic structure of a universal algebra, which specifies the structure of the representation.

•  The auxiliary sentences $A_\gamma$ of $\gamma$ are a set of sentences that define new
functions and predicates concerning $\gamma$. $A_\gamma$ is used to simplify the expression.

•  The restriction formula $R_\gamma$ of $\gamma$ is a well-formed formula with one free
variable. This formula specifies a subset of the domain of the universal algebra defined by
$T_\gamma$. It is the restriction condition on the domain.

The above construction provides a language for conceptualization of the real world. But
we should keep in mind that our conceptualization is always incomplete. Since any object in
the real world has almost infinitely many attributes, our conceptualization of the object will
be only an approximation. We should distinguish between the "real conceptual world" and "our
conceptualization." The real conceptual world is the complete conceptualization of the real
physical world. In the real conceptual world, a concept can be characterized by its set of
attributes. Namely, any two distinct concepts have different sets of attributes. However, since
our conceptualization may not be complete, two distinct concepts may be expressed with identical
attributes. Therefore we need C-class names to identify each distinct concept. (It is true that
we can carefully choose attribute names so that any two distinct concepts are expressed with
different attributes in our conceptualization. However, it becomes fairly difficult to design
a schema in such a way if the schema is big. Moreover, if the schema changes in the course
of time, the maintenance of consistent attribute names becomes much more difficult.)

3.1.2  Examples  of  C-classes 

We use the prefix notation for $+$, $>$, etc., instead of the conventional infix notation. The
only exception is equality $=$.

•  Integer

In our model, we treat integers as the instances of a C-class.

\[ Integer = (integer, \emptyset, \bot, T_{integer}, A_{integer}, TRUE), \]

where

\[ T_{integer} = \{ \forall x \forall y\ +(x, y) = +(y, x), \]
\[ \qquad \forall x \forall y \forall z\ +(x, +(y, z)) = +(+(x, y), z), \]
\[ \qquad \text{etc.} \}, \]

\[ A_{integer} = \{ \forall x\ Positive(x) \equiv\ >(x, 0),\ \text{etc.} \}. \]

•  People

People is conceptualized by name and age in this example.

\[ Person = (person, \{name, age\}, v_{person}, T_{person}, A_{person}, R_{person}), \]

\[ v_{person}(name) = string, \quad v_{person}(age) = integer, \]

where string and integer designate the C-classes that have the algebraic structure of strings
and integers.

\[ T_{person} = \{ \forall x\ T(name(x), string) \wedge T(age(x), integer), \]
\[ \qquad \forall x \forall y\ name(modify(x, person, y)) = y, \]
\[ \qquad \forall x \forall y\ age(modify(x, person, y)) = y \}, \]

where modify designates the function that modifies the attribute values of C-classes.

\[ A_{person} = \{ \forall x\ OldPerson(x) \Leftrightarrow age(x) > 60,\ \text{etc.} \}, \]

\[ R_{person}(x) = ((0 < age(x) < 200) \wedge \ldots). \]

•  Rational Numbers

Structured values, such as rational numbers, are also expressed by instances of a
C-class. Rational numbers are expressed by:

\[ Rational = (rational, \{num, den\}, v_{Rational}, T_{Rational}, A_{Rational}, R_{Rational}), \]

where

\[ v_{Rational}(num) = v_{Rational}(den) = integer, \]

\[ T_{Rational} = \{ \forall a \forall b\ num(a) = x \wedge num(b) = u \wedge den(a) = y \wedge den(b) = v \]
\[ \qquad \Rightarrow num(+(a, b)) = +(*(x, v), *(u, y)) \wedge den(+(a, b)) = *(y, v), \]
\[ \qquad \text{etc.} \}, \]

\[ A_{Rational} = \{ \forall x\ Invertible(x) \equiv \neg(num(x) = 0),\ \text{etc.} \}, \]

\[ R_{Rational}(x) = \neg(den(x) = 0). \]


•  Set_of_Integer

A set of a concept is expressed as a C-class without attributes. We assume that a
predicate symbol $T$ is provided to designate the instance-class relation. We also assume
that each set C-class has a standard predicate $In$, such that $In(x, y)$ means $x$ is in a set
$y$. We will extend this example to a general case later.

\[ Set\_of\_Integer = (set\_of\_integer, \emptyset, \bot, T_{Set}, A_{Set\_of\_Integer}, R_{Set\_of\_Integer}), \]

\[ T_{Set} = \{ \cup(x, y) = \cup(y, x),\ \cap(x, \cup(y, z)) = \cup(\cap(x, y), \cap(x, z)),\ \text{etc.} \}, \]

\[ R_{Set\_of\_Integer}(x) \equiv (\forall y\ In(y, x) \Rightarrow T(y, integer)), \]

\[ A_{Set\_of\_Integer} = \{ \forall x\ One\_Element(x) \equiv (\forall y \forall z\ In(y, x) \wedge In(z, x) \Rightarrow y = z),\ \text{etc.} \}. \]

In the above examples, we have introduced the relation symbols $T$ and $In$. From now on, we
assume these symbols are part of the basic construct of C-classes.
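
The six components of a C-class map naturally onto a record type. The sketch below is one possible encoding of the Person example and is not part of the formalism: instances are Python dictionaries, sentences are kept as plain strings, predicates are Python functions, and the names `CClass`, `PRIMITIVE_CHECKS`, `is_instance` and `modify` (here taking an attribute name as its second argument) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CClass:
    name: str                               # n: the C-class name
    attrs: Dict[str, str]                   # Phi and v together: attribute -> C-class name
    restriction: Callable[[dict], bool]     # R: formula with one free variable
    structural: List[str] = field(default_factory=list)               # T, kept as text
    auxiliary: Dict[str, Callable] = field(default_factory=dict)      # A: derived predicates


# Hypothetical stand-in for the type predicate T on primitive C-classes.
PRIMITIVE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int),
}

person = CClass(
    name="person",
    attrs={"name": "string", "age": "integer"},
    restriction=lambda x: 0 < x["age"] < 200,
    structural=["forall x. T(name(x), string) and T(age(x), integer)"],
    auxiliary={"OldPerson": lambda x: x["age"] > 60},
)


def is_instance(c, x):
    # x has exactly the attributes of c, each value has the declared type,
    # and the restriction formula holds.
    typed = set(x) == set(c.attrs) and all(
        PRIMITIVE_CHECKS[c.attrs[f]](x[f]) for f in c.attrs
    )
    return typed and c.restriction(x)


def modify(x, f, y):
    # Returns a copy of x whose attribute f has value y, so that f(modify(x, f, y)) = y.
    return {**x, f: y}
```

For example, `is_instance(person, {"name": "Ann", "age": 30})` holds, while an age of 250 violates the restriction formula.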
3.1.3  Primitive C-Classes

Let us consider a concept with some attributes. We may say the concept is constructed from
the concepts that are its attribute values. To formalize this intuition, we impose a
condition on the structural sentences $T_\gamma$ of a C-class with non-empty attributes. Unless
declared otherwise, we assume that any C-class with non-empty attributes has structural
sentences $T_\gamma$ containing the following sentences $T^0_\gamma$.

Let $\Phi_\gamma$ be $\{f_1, \ldots, f_n\}$, then

\[ T^0_\gamma = \{ \forall x \forall y\ f_i(modify(x, f_i, y)) = y \mid i = 1, \ldots, n \} \cup \]
\[ \qquad \{ \forall x\ T(f_i(x), v_\gamma(f_i)) \mid i = 1, \ldots, n \} \]

The function symbol modify designates the function that modifies the attributes of C-classes.
The typical model of the sentences $T^0_\gamma$ is the cartesian product of the attributes
specified by $\Phi_\gamma$. Hence every C-class is constructed out of its attribute C-classes, if
its attributes are not empty. In this sense, if a C-class has no attributes, we call it a
primitive C-class. A C-class that is not primitive is called a compound C-class.

In the last example of Section 3.1.2, we have shown that the set of integers is expressed
as a primitive C-class. Later, we will extend this example to express the set of any C-class
as a primitive C-class. This may seem a little strange, because it contradicts the term
"primitive." It may be considered that the set of a C-class should be formalized as something
complex. We use the term "primitive" to mean "structureless." In a model theoretic sense, a
set C-class is structureless: the operations that are allowed on it are the standard union,
intersection, etc. There is no algebraic operation that accesses its "sub-structure."

3.2  Universal  Language 

Since the description of concepts is essentially local to each concept, there may be
inconsistency in the names of function symbols and relation symbols. For example, a person can be
conceptualized by a C-class Person:

\[ Person = (person, \{name, address\}, v_{person}, \emptyset, \emptyset, TRUE). \]

On the other hand, a subconcept Student of Person may be expressed by a C-class Student:

\[ Student = (student, \{s\_name, residence\}, v_{student}, \emptyset, \emptyset, TRUE). \]

In this case, s_name and residence are intended to express the name and the address of the
student respectively. So, in order to designate the intended equivalence of these symbols, we
need a common language. We call this common language the universal language of $\Gamma$. Later,
we need the common language to define the hierarchy of the concepts. The precise definition is
as follows.

3.2.1  Universal  Renaming 

In order to describe the correspondence of attribute names of C-class descriptions, we define
the notion of renaming as follows. For $i$ being 1 or 2, let $L_i$ be a first order language made
of a set $V_i$ of variables, a set $F_i$ of function symbols and a set $R_i$ of predicate symbols.
A renaming $\alpha$ from $L_1$ to $L_2$ is a collection of injective mappings from $V_1$ to $V_2$,
$F_1$ to $F_2$ and $R_1$ to $R_2$:

\[ \alpha = (\alpha_v, \alpha_f, \alpha_r), \quad \alpha_v : V_1 \to V_2, \quad \alpha_f : F_1 \to F_2, \quad \alpha_r : R_1 \to R_2, \]

such that it preserves the similarity types of function symbols and predicate symbols. Namely,
if a function symbol $f$ has $n$ arguments, $\alpha_f(f)$ also has $n$ arguments. Similarly for
predicate symbols. Note that the renaming $\alpha$ induces an injective mapping from $L_1$ to $L_2$.

Let $L(\gamma)$ be the language generated by the symbols of the description of $\gamma$. Then a
language $L$ is the universal language of $\Gamma$, if there exists a set $\mathcal{N}$ of
renamings such that:

\[ \mathcal{N} = \{ \alpha_\gamma \mid \gamma \in \Gamma \}, \]

\[ \forall \gamma \in \Gamma, \quad \alpha_\gamma : L(\gamma) \to L. \]

In a practical case, we may require that the symbols of the same intended meaning will be
mapped to the same symbol in the universal language. In the above example,

\[ \alpha_{person}(name) = \alpha_{student}(s\_name), \]

\[ \alpha_{person}(address) = \alpha_{student}(residence). \]

However these are meta-conditions. Theoretically the set $\mathcal{N}$ of morphisms determines the
semantics of symbols. If we have

\[ \alpha_{person}(name) = \alpha_{student}(residence), \]

it means that the 'name' of 'person' has the same semantics as the 'residence' of 'student',
although it is different from the common meaning of the words "name" and "residence." The set
$\mathcal{N}$ of renamings is called the universal renaming of $\Gamma$.
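
A renaming restricted to function symbols can be sketched as a dictionary. The example below is illustrative only: it assumes symbols are strings with a known number of arguments, and the universal symbols `u_name` and `u_address`, like the helper `is_renaming`, are hypothetical.

```python
def is_renaming(alpha, arity_src, arity_dst):
    # alpha maps source symbols to target symbols; it must be injective
    # and must preserve the number of arguments of every symbol.
    injective = len(set(alpha.values())) == len(alpha)
    typed = all(arity_src[s] == arity_dst[alpha[s]] for s in alpha)
    return injective and typed


# Attribute symbols are unary function symbols in both local languages.
arity_person = {"name": 1, "address": 1}
arity_student = {"s_name": 1, "residence": 1}
arity_universal = {"u_name": 1, "u_address": 1}

alpha_person = {"name": "u_name", "address": "u_address"}
alpha_student = {"s_name": "u_name", "residence": "u_address"}
```

Two local symbols have the same semantics exactly when they are sent to the same universal symbol, for instance `alpha_person["name"] == alpha_student["s_name"]`. Swapping the targets in `alpha_student` would still give a valid renaming, which is why the intended reading of the symbols is a meta-condition and not something the definition can enforce.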

3.2.2  Local  Renaming 

In actual programming, it is difficult to describe the global semantic equality from the
beginning. We can only specify the semantic equality locally, i.e. we only provide the renaming
between the description languages of C-classes. In the above example, we may provide the
renaming $\alpha_{Student,Person}$ from $L(Student)$ to $L(Person)$. When we have provided
renamings between the description languages of individual concepts, we expect that there exists a
universal renaming, which is compatible with those renamings. Before considering the existence,
we introduce the conditions that those locally defined renamings should satisfy.

Let $G$ be a subset of $\Gamma \times \Gamma$, and let $J$ be the set of injective renamings among
$L(\gamma)$'s, such that

\[ J = \{ \alpha_{\gamma_1,\gamma_2} \mid \alpha_{\gamma_1,\gamma_2} : L(\gamma_1) \to L(\gamma_2),\ (\gamma_1, \gamma_2) \in G \}. \]

We call $(G, J)$ the semantic local renaming of $\Gamma$ if the following conditions are satisfied.

1.  Transitivity

\[ \alpha_{\gamma,\gamma'}, \alpha_{\gamma',\gamma''} \in J \Rightarrow (\alpha_{\gamma,\gamma''} \in J \wedge \alpha_{\gamma,\gamma''} = \alpha_{\gamma',\gamma''} \circ \alpha_{\gamma,\gamma'}), \]

where $\circ$ is the composition of mappings.

2.  Route Independence

\[ (\gamma, \gamma_1), (\gamma, \gamma_2), (\gamma_1, \gamma'), (\gamma_2, \gamma') \in G \Rightarrow \alpha_{\gamma_1,\gamma'} \circ \alpha_{\gamma,\gamma_1} = \alpha_{\gamma_2,\gamma'} \circ \alpha_{\gamma,\gamma_2}. \]

3.  Acyclicity  The binary relation $G$ has no cycle.

The first condition expresses the global semantic compatibility of the morphisms. If a
symbol $s$ is semantically equivalent to a symbol $s'$ and $s'$ is equivalent to $s''$, then $s$
should be equivalent to $s''$ by the transitive rule of equivalence relations. The second
condition designates the consistency of the inherited attributes. The third condition describes
the relevant structure of a hierarchy.

Note that we can eliminate the first condition. In fact, the second condition guarantees
that we can extend $(G, J)$ to another semantic local renaming $(G', J')$ so that $G'$ is transitive.
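
Route independence and acyclicity can be checked directly on a finite hierarchy. The sketch below assumes that G is a set of pairs of C-class names and that J maps each pair to a dictionary of symbol renamings; the diamond-shaped hierarchy with `ta`, `student`, `employee` and `person`, and the helper names, are hypothetical.

```python
def compose(second, first):
    # (second o first): apply `first`, then `second`.
    return {s: second[t] for s, t in first.items()}


def route_independent(G, J):
    # For every pair of two-step routes g -> g1 -> gp and g -> g2 -> gp,
    # the composed renamings must coincide.
    for (g, g1) in G:
        for (h, g2) in G:
            if h != g:
                continue
            for (k, gp) in G:
                if k != g1 or (g2, gp) not in G:
                    continue
                left = compose(J[(g1, gp)], J[(g, g1)])
                right = compose(J[(g2, gp)], J[(g, g2)])
                if left != right:
                    return False
    return True


def acyclic(G):
    # Repeatedly remove nodes without incoming edges; a cycle leaves a non-empty rest.
    nodes = {n for edge in G for n in edge}
    edges = set(G)
    while nodes:
        sources = {n for n in nodes if not any(b == n for (_, b) in edges)}
        if not sources:
            return False
        nodes -= sources
        edges = {(a, b) for (a, b) in edges if a not in sources}
    return True


G = {("ta", "student"), ("ta", "employee"), ("student", "person"), ("employee", "person")}
J = {
    ("ta", "student"): {"ta_name": "s_name"},
    ("ta", "employee"): {"ta_name": "e_name"},
    ("student", "person"): {"s_name": "name"},
    ("employee", "person"): {"e_name": "name"},
}
```

Here both routes from `ta` to `person` send `ta_name` to `name`. If the renaming from `employee` to `person` sent `e_name` to `address` instead, the attribute inherited along the two routes would be inconsistent and the check would fail.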

3.2.3  Existence  of  Universal  Language 

If we have a local renaming, there exists a universal language and universal renaming such that
the universal renaming is compatible with the given local renaming, under a certain condition.
Let us define a partial order on $\Gamma$ by the binary relation $G$.

\[ \gamma \preceq_G \gamma' \Leftrightarrow (\gamma, \gamma') \in G. \]

Theorem 1  Let $\Gamma$ be a set of concepts, and let $(G, J)$ be a semantic local renaming of
$\Gamma$. If $G$ is at most countably infinite, and $\Gamma$ has finitely many minimal elements
with respect to $\preceq_G$, then there exists a universal language $L$ and the universal renaming
$\mathcal{N}$ of $\Gamma$ to $L$, such that

\[ \alpha_{\gamma,\gamma'} \in J \Rightarrow \alpha_\gamma = \alpha_{\gamma'} \circ \alpha_{\gamma,\gamma'}, \]

where

\[ \mathcal{N} = \{ \alpha_\gamma \mid \gamma \in \Gamma \}. \]

3.3  Fundamental  Operator  on  C-Classes 

In order to construct complex C-classes out of given C-classes, we define several operations
on C-classes. These operators are an abstraction of the mental process by which human beings
create new concepts out of existing concepts. These fundamental operators correspond to the
fundamental operators for data algebras. In fact, the fundamental operators on data algebras
will provide the models of the fundamental operators on C-classes.

3.3.1  Aggregation 

For given C-classes, we can create a new C-class by introducing a C-class name, attribute
names that correspond to the given C-classes, and a set of sentences that specifies a structure
similar to a cartesian product such that the attribute names designate projections. Let
$\bar\gamma$ be a sequence of C-classes,

\[ \bar\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_n), \]

and let $\Phi$ be a sequence of symbols with the same length as $\bar\gamma$,

\[ \Phi = (f_1, f_2, \ldots, f_n). \]

We express each component C-class in $\bar\gamma$ by:

\[ \gamma_i = (n_i, \Phi_i, v_i, T_i, A_i, R_i) \quad (i = 1, \ldots, n). \]

Then the aggregation $\Pi(n_\Pi, \bar\gamma, \Phi)$ of $\bar\gamma$ is defined as follows.

\[ \Pi(n_\Pi, \bar\gamma, \Phi) = (n_\Pi, \Phi, v_\Pi, T_\Pi, A_\Pi, R_\Pi) \]

•  The name $n_\Pi$ of the aggregation is a symbol that is compatible with the other C-classes.
Namely, the symbol never appears as the name of another C-class.

•  The symbols in $\Phi$ are the attribute names of the aggregated C-class
$\Pi(n_\Pi, \bar\gamma, \Phi)$.

•  The attribute value $v_\Pi$ is the mapping from the components of $\Phi$ to the set $\Gamma$
of C-classes, such that

\[ \forall i\ (1 \le i \le n), \quad v_\Pi(f_i) = \gamma_i. \]

•  The structural sentences $T_\Pi$ are similar to $T^0_\gamma$ for a C-class $\gamma$ with
non-empty attributes.

\[ T_\Pi = \{ \forall x \forall y\ f_i(modify(x, f_i, y)) = y \mid i = 1, \ldots, n \} \cup \]
\[ \qquad \{ \forall x\ T(f_i(x), v_\Pi(f_i)) \mid i = 1, \ldots, n \} \]

The symbol modify is the function symbol for the modifier of attribute values.

•  The auxiliary sentences $A_\Pi$ may be any definition of new function symbols and relation
symbols that simplify the description.

•  Each component of the aggregation should satisfy the restrictions that are imposed on
the attribute value C-classes. The restriction predicate $R_\Pi$ is defined by:

\[ R_\Pi(x) = \bigwedge_{i=1}^{n} R_i(f_i(x)). \]

The aggregation of C-classes has a model that corresponds to the aggregation of data
algebras, which was defined in section 2.3.1. This will be discussed later.
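
The construction of the attribute value mapping and the restriction predicate of an aggregation can be sketched as follows. The encoding is an assumption made for illustration: a component C-class is reduced to a pair of its name and its restriction predicate, instances of the aggregate are dictionaries keyed by attribute name, and `aggregate` and `nonzero_integer` are hypothetical names.

```python
def aggregate(name, components, symbols):
    # components: list of (class_name, restriction) pairs, one per attribute;
    # symbols: the attribute names f_1, ..., f_n of the new C-class.
    if len(components) != len(symbols):
        raise ValueError("one attribute symbol is needed per component")

    # Attribute value mapping: f_i is sent to the i-th component class.
    v = {f: class_name for f, (class_name, _) in zip(symbols, components)}

    def restriction(x):
        # Conjunction over i of R_i(f_i(x)).
        return all(r_i(x[f]) for f, (_, r_i) in zip(symbols, components))

    return {"name": name, "attrs": list(symbols), "v": v, "R": restriction}


integer = ("integer", lambda v: isinstance(v, int))
nonzero_integer = ("nonzero_integer", lambda v: isinstance(v, int) and v != 0)
rational = aggregate("rational", [integer, nonzero_integer], ["num", "den"])
```

In this variant the condition that the denominator is non-zero is inherited from a component class, whereas the Rational example of Section 3.1.2 states it in the restriction formula of the compound C-class itself.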

3.3.2  Recursive  Aggregation 

Let $G$ be a directed graph with a set of C-class names $V$ as nodes and labeled edges $E$. Let
$U$ be the collection of nodes in $V$ that have no incoming edge. Further let $W$ be a subset
of $V$ that contains $U$,

\[ U \subset W \subset V. \]

We assume that for elements of $W$, C-classes are given. We denote an element of $E$ as
$(n, m, g)$, which designates the edge from $n$ to $m$ with label $g$. Let $\Phi$ be a set of
symbols that has a one to one correspondence with $V$.

\[ \Phi = \{ f_v \mid v \in V \}. \]

The recursive aggregation $\Pi(n_\Pi, G, \Phi)$ with respect to $G$, $\Phi$ and $W$ is defined as
follows.

\[ \Pi_W(n_\Pi, G, \Phi) = (n_\Pi, \Phi, v_\Pi, T_\Pi, A_\Pi, R_\Pi). \]

•  The symbol $n_\Pi$ is a new C-class name.

•  The symbols $\Phi$ are the attribute names.

•  The attribute values are provided by the one to one correspondence of $\Phi$ and $V$.

\[ \forall v \in V, \quad v_\Pi(f_v) = v. \]

•  The structural sentences express the nested structure defined by $G$. Let $V$ be
$(v_1, \ldots, v_k)$; we consider $V$ as a sequence.

\[ T_\Pi = \{ \forall x_1 \ldots \forall x_k\ f_{v_i}(cons(x_1, \ldots, x_k)) = x_i \mid i = 1, \ldots, k \} \cup \]
\[ \qquad \{ \forall x\ T(f_v(x), v) \mid v \in V \} \cup \]
\[ \qquad \{ \forall x\ T(g(f_v(x)), u) \mid (v, u, g) \in E \} \]

•  The auxiliary sentences include recursive definitions of restriction predicates for
component C-classes.

\[ A_\Pi = \bigcup_{v \in V - W} \Bigl\{ \forall x\ R_v(x) \Leftrightarrow \Bigl( (x = v_\bot) \vee \bigwedge_{(v,u,g) \in E} R_u(g(x)) \Bigr) \Bigr\} \]

where $v_\bot$ is intended to designate the null value in the universal algebra. For $v$ in
$V - W$, the "$v$-th" component of $\Pi(G, \Phi)$ is a C-class with recursive structure.

•  The restriction predicate designates that each component should satisfy its own
restriction predicate,

\[ R_\Pi(x) = \bigwedge_{v \in V} R_v(f_v(x)). \]

3.3.3  Abstraction 

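Let $\gamma$ be a C-class

\[ \gamma = (n_\gamma, \Phi_\gamma, v_\gamma, T_\gamma, A_\gamma, R_\gamma), \]

and let $\Psi$ be a subset of $\Phi_\gamma$,

\[ \Psi = \{ g_1, \ldots, g_m \} \subset \Phi_\gamma. \]

The abstraction $\Upsilon(n_\Upsilon, \gamma, \Psi)$ of $\gamma$ with respect to $\Psi$ is
defined as:

\[ \Upsilon(n_\Upsilon, \gamma, \Psi) = (n_\Upsilon, \Psi, v_\Upsilon, T_\Upsilon, A_\Upsilon, R_\Upsilon). \]

The definitions of $n_\Upsilon$, $T_\Upsilon$, and $A_\Upsilon$ are similar to those of aggregation.

•  $n_\Upsilon$ is a symbol, which designates the name of $\Upsilon(n_\Upsilon, \gamma, \Psi)$.

•  $\Psi$ is the set of symbols that designate the attributes of the new C-class.

•  The attribute values are the same as those of $\gamma$,

\[ \forall g \in \Psi, \quad v_\Upsilon(g) = v_\gamma(g). \]

•  $T_\Upsilon$ is the set of structural sentences defined as follows.

\[ T_\Upsilon = \{ \forall x \forall y\ g_i(modify(x, g_i, y)) = y \mid i = 1, \ldots, m \} \cup \]
\[ \qquad \{ \forall x\ T(g_i(x), v_\gamma(g_i)) \mid i = 1, \ldots, m \}. \]

•  The restriction relation $R_\Upsilon$ is defined as:

\[ R_\Upsilon(x) = \Bigl( \exists y\ T(y, n_\gamma) \wedge R_\gamma(y) \wedge \Bigl( \bigwedge_{g \in \Psi} (g(y) = g(x)) \Bigr) \Bigr). \]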
3.3.4  Restriction

The restriction operator replaces the restriction formula of a C-class by the conjunction of the
original restriction formula$^{10}$ and a unary predicate. For a C-class $\gamma$,

\[ \gamma = (n_\gamma, \Phi_\gamma, v_\gamma, T_\gamma, A_\gamma, R_\gamma), \]

the restriction of $\gamma$ by a unary relation $S$ is

\[ \Theta(n_\Theta, \gamma, S) = (n_\Theta, \Phi_\gamma, v_\gamma, T_\gamma, A_\gamma, R_\gamma \wedge S). \]

3.3.5  Set  Construction 

For a C-class $\gamma$

\[ \gamma = (n_\gamma, \Phi_\gamma, v_\gamma, T_\gamma, A_\gamma, R_\gamma), \]

the set of $\gamma$ is defined as follows. This definition is a generalization of the example
discussed for Set_of_Integer before. The relation symbols $T$ and $In$ have the same meaning as in
the example of Set_of_Integer.

\[ Set(n_{set}, \gamma) = (n_{set}, \emptyset, \bot, T_{set}, A_{set}, R_{Set(\gamma)}), \]

where

\[ R_{Set(\gamma)}(x) \equiv (\forall y\ In(y, x) \Rightarrow T(y, n_\gamma)). \]

The structural sentences of $Set(n_{set}, \gamma)$ are just the theory $T_{set}$ of sets, for any
C-class $\gamma$. The auxiliary sentences $A_{set}$ may be defined arbitrarily to meet the
appropriate description of C-classes. Although $R_{set}$ says nothing about the cardinality of the
set, we assume that the cardinality is finite. More precisely, we only consider finite sets as the
model of the set C-class $Set(n_{set}, \gamma)$. Combining the set operation with the restriction
operation, we get a more general set of C-classes. More specifically, subsets of the set
$Set(n_{set}, \gamma)$ of a C-class $\gamma$ will be expressed by applying a restriction operator
to $Set(n_{set}, \gamma)$.

3.3.6  Categorization 

Once we have the notion of the set construction of a C-class, we can categorize the elements of
the set with respect to some attributes. In the categorization, we ignore the other attributes,
which are not of interest. We obtain a set of sets of a concept by taking a categorization. We
define the categorization operator as follows. Let $\gamma$ be a C-class, and let the attributes
of interest $\Psi$ be a subset of the attributes $\Phi_\gamma$.

\[ \gamma = (n_\gamma, \Phi_\gamma, v_\gamma, T_\gamma, A_\gamma, R_\gamma), \quad \Psi \subset \Phi_\gamma. \]

The categorization $\Omega(n_\Omega, \gamma, \Psi)$ of the C-class $\gamma$ with respect to $\Psi$ is:

\[ \Omega(n_\Omega, \gamma, \Psi) = (n_\Omega, \emptyset, \bot, T_{set}, A_\Omega, R_\Omega), \]

where

\[ R_\Omega(x) \equiv \Bigl( \forall y \forall u \forall v\ In(y, x) \wedge In(u, y) \wedge In(v, y) \]
\[ \qquad \Rightarrow T(u, n_\gamma) \wedge T(v, n_\gamma) \wedge \Bigl( \bigwedge_{f \in \Psi} (f(u) = f(v)) \Bigr) \Bigr). \]

$^{10}$ We assume that the free variables of these formulas are the same.


3.3.7  Generated  C-Classes 


We can consider the closure by the fundamental operators on C-classes in the same manner
as for data algebras. A universal family of C-classes is a set of C-classes that is closed under
the fundamental operators. The universal closure of a set of C-classes is the minimum universal
family that contains those C-classes.


3.4  Hierarchy  of  C-Classes 

To  formalize  the  hierarchy  of  concepts,  we  introduce  a  partial  order  among  C-classes.  We 
assume  that  C-classes  are  described  in  a  universal  language.  If  concepts  are  precisely  expressed 
in  the  real  conceptual  world,  we  can  express  the  hierarchy  of  concepts  by  referring  to  only 
attributes.  Namely  a  concept  has  more  attributes  than  its  superconcept.  Thus  we  can  express 
the conceptual hierarchy by inclusion of attributes. Roughly speaking, we can formalize it as
follows. Let $c, c'$ be concepts, and let $\Phi_c$, $\Phi_{c'}$ be the attributes of $c$, $c'$ respectively.
Then $c$ is a subconcept of $c'$ if and only if

$\Phi_c \supseteq \Phi_{c'}.$

However, as we discuss in Chapter 4, our conceptualization is incomplete. Hence we cannot
specify the hierarchy only by its attributes. We need to specify the hierarchy explicitly by
introducing an order on the concepts. So we introduce an artificial partial order $\preceq_n$ on the
names of C-classes. Let $n_1, n_2$ be the names of C-classes $\gamma_1, \gamma_2$ respectively. We say $n_1$ is a
subname of $n_2$ if

$n_1 \preceq_n n_2.$

We assume that the type matching predicate $T$ that is introduced in Section 3.3.5 satisfies the
following condition:

$\forall n_1 \forall n_2\ \ n_1 \preceq_n n_2 \Rightarrow (\forall x\ T(x, n_1) \Rightarrow T(x, n_2)).$

We include the above sentence as a part of our theory. With this name hierarchy, we introduce a
hierarchy among C-classes.

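Let $\gamma_1, \gamma_2$ be C-classes,

$\gamma_i = (n_i, \Phi_i, \nu_i, T_i, A_i, R_i) \quad (i = 1, 2).$

Then $\gamma_1$ is a subclass of $\gamma_2$,

$\gamma_1 \preceq \gamma_2,$

if the following holds:

$n_1 \preceq_n n_2, \quad T_1 \models T_2,$

$\Phi_2 \subseteq \Phi_1, \quad \forall f \in \Phi_2,\ \gamma(\nu_1(f)) \preceq \gamma(\nu_2(f)),$

$\models \forall x\ R_{(\gamma_1, \Phi_2)}(x) \Rightarrow R_2(x),$

where $\gamma(\nu_i(f))$ designates the C-class with name $\nu_i(f)$ ($i = 1, 2$), and

$R_{(\gamma_1, \Phi_2)}(x) = \exists y\ T(y, n_1) \wedge R_1(y) \wedge \bigwedge_{g \in \Phi_2} (g(y) = g(x)).$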

Since each C-class has a unique name, we could have defined the hierarchy only by the
name hierarchy. However, as we discussed above, the name hierarchy is a compromise for our
incomplete conceptualization. Therefore it is natural to reflect the effect of attributes in the
definition of the C-class hierarchy as much as possible. Thus the attributes of C-classes play the
major role in determining the hierarchy of C-classes.
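The structural part of the subclass definition can be sketched in a few lines. This is a simplified illustration under stated assumptions: a C-class is reduced to a name and a map from attribute names to type names, the name hierarchy is given as a set of (lower, upper) pairs, and the conditions on the structural sentences and on the restriction predicates are not checked at all. The functions `name_leq` and `is_subclass` and the sample classes are hypothetical.

```python
def name_leq(order, a, b):
    """True if a is below or equal to b in the reflexive-transitive closure of `order`."""
    seen, stack = set(), [a]
    while stack:
        x = stack.pop()
        if x == b:
            return True
        if x in seen:
            continue
        seen.add(x)
        stack.extend(hi for lo, hi in order if lo == x)
    return False


def is_subclass(c1, c2, order):
    """Check the name and attribute conditions only (sentences and restrictions are ignored)."""
    if not name_leq(order, c1["name"], c2["name"]):
        return False
    if not set(c2["attrs"]) <= set(c1["attrs"]):
        return False
    # Each shared attribute of the subclass must have a type at or below that of the superclass.
    return all(name_leq(order, c1["attrs"][f], c2["attrs"][f]) for f in c2["attrs"])


# Hypothetical name hierarchy and classes.
order = {("student", "person"), ("person", "top")}
person = {"name": "person", "attrs": {"name": "string"}}
student = {"name": "student", "attrs": {"name": "string", "school": "string"}}
```

With these definitions `is_subclass(student, person, order)` holds and the converse does not, because `person` lacks the attribute `school` and its name is not below `student`.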

We should note that we can have the most general C-class in the following way. First we
assume that there is a greatest element, say $top$, in the name hierarchy. Then, the most
general C-class $\gamma_\top$ is:

$\gamma_\top = (top, \emptyset, \bot, \emptyset, \emptyset, TRUE).$

We assume that the theory $T_=$ of equality is always implicitly included in the structural
sentences for any C-class $\gamma$:

$T_= = \{\forall x\ x = x\} \cup \{\forall x \forall y\ x = y \Rightarrow y = x\} \cup \{\forall x \forall y \forall z\ (x = y \wedge y = z) \Rightarrow x = z\}.$

Thus if we write $T_\gamma$ as empty, it means that the structure is specified only by $T_=$. Namely, it
is just the structure of a set.

3.5  Conceptual  Order  and  Fundamental  Operators 

The  conceptual  order  is  the  realization  of  semantic  hierarchy  of  concepts.  There  is  a  close 
relation  between  conceptual  order  and  the  fundamental  operators,  as  shown  in  the  following 
theorem. 

Theorem 2 Let $\gamma$, $\gamma'$, $\bar\gamma = \{\gamma_i\}_{i=1}^{n}$ and $\bar\gamma' = \{\gamma'_i\}_{i=1}^{n}$ be C-classes. Moreover, let $n$ and $n'$ be
new C-class names such that $n \preceq_n n'$.

• Aggregation

For attribute names $\Phi$,

$(\forall i,\ 1 \le i \le n,\ \gamma_i \preceq \gamma'_i) \Rightarrow \Pi(n, \bar\gamma, \Phi) \preceq \Pi(n', \bar\gamma', \Phi).$

• Abstraction

For a subset $\Psi$ of the attributes of $\gamma$,

$n_\gamma \preceq_n n \Rightarrow \gamma \preceq \Upsilon(n, \gamma, \Psi).$

• Restriction

For a unary predicate $S$,

$n \preceq_n n_\gamma \Rightarrow \Theta(n, \gamma, S) \preceq \gamma.$

• Set Construction

$\gamma \preceq \gamma' \Rightarrow Set(n, \gamma) \preceq Set(n', \gamma').$

• Categorization

If the set of attributes $\Psi$ is common to $\gamma$ and $\gamma'$, then

$\gamma \preceq \gamma' \Rightarrow \Omega(n, \gamma, \Psi) \preceq \Omega(n', \gamma', \Psi).$

The proof of the theorem is easy, so it is omitted.

3.6  Generalization  and  Specialization

In our mental processes, we generalize several concepts by taking the common attributes of
those concepts. For example, we get the concept 'mammal' by generalizing 'dog', 'cat', 'monkey',
etc. On the other hand, we specify a concept as the semantic intersection of several concepts.
For example, the natural number is described by the semantic intersection of integer and
positive number. We formalize these mental processes using the conceptual hierarchy provided
above.

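Let us assume that a conceptual hierarchy $\preceq$ is given. First we introduce some notation.
Let $\{\gamma_i\}_{i=1}^{n}$ be a set of C-classes. If the least upper bound of $\{\gamma_i\}_{i=1}^{n}$ with respect to $\preceq$ exists,
we denote it by

$\bigvee_{i=1}^{n} \gamma_i.$

If $n = 2$, we denote it by

$\gamma_1 \vee \gamma_2.$

Dually, the greatest lower bound of $\{\gamma_i\}_{i=1}^{n}$ is denoted by

$\bigwedge_{i=1}^{n} \gamma_i,$

or, if $n = 2$, by

$\gamma_1 \wedge \gamma_2.$

By definition, the operators $\vee$ and $\wedge$ are commutative and associative. Furthermore,

$\bigvee_{i=1}^{n} \gamma_i = (\gamma_1 \vee (\gamma_2 \vee (\cdots (\gamma_{n-1} \vee \gamma_n) \cdots))),$

$\bigwedge_{i=1}^{n} \gamma_i = (\gamma_1 \wedge (\gamma_2 \wedge (\cdots (\gamma_{n-1} \wedge \gamma_n) \cdots))).$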

Now  we  define  the  generalization  and  specialization. 

The generalization of $\{\gamma_i\}_{i=1}^{n}$ is defined by the least upper bound $\bigvee_{i=1}^{n} \gamma_i$. In particular,
the generalization of two C-classes $\gamma$ and $\gamma'$ is $\gamma \vee \gamma'$. As stated above, any generalization
is described by the operator $\vee$. We call $\vee$ the generalization operator. The definition of the
specialization is similar to that of the generalization. We replace $\vee$ and "least upper bound"
in the definition of generalization by $\wedge$ and "greatest lower bound" respectively. We call the
operator $\wedge$ the specialization operator.
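Generalization and specialization as least upper bound and greatest lower bound can be sketched over a finite hierarchy. The sketch below assumes, for illustration only, the rough attribute-inclusion reading of the order (a class is below another when it has at least the other's attributes) and a small hypothetical universe of named classes; `join` and `meet` return `None` when no bound exists in that universe.

```python
def join(leq, elems, universe):
    """Least upper bound of elems in universe under leq, or None if it does not exist."""
    ubs = [u for u in universe if all(leq(e, u) for e in elems)]
    least = [u for u in ubs if all(leq(u, v) for v in ubs)]
    return least[0] if least else None


def meet(leq, elems, universe):
    """Greatest lower bound of elems in universe under leq, or None if it does not exist."""
    lbs = [l for l in universe if all(leq(l, e) for e in elems)]
    greatest = [l for l in lbs if all(leq(m, l) for m in lbs)]
    return greatest[0] if greatest else None


# Hypothetical classes described only by their attribute sets.
attrs = {
    "top": frozenset(),
    "mammal": frozenset({"legs", "fur"}),
    "dog": frozenset({"legs", "fur", "barks"}),
    "cat": frozenset({"legs", "fur", "meows"}),
    "integer": frozenset({"numeric", "whole"}),
    "positive": frozenset({"numeric", "positive"}),
    "natural": frozenset({"numeric", "whole", "positive"}),
}


def leq(a, b):
    """a is below b when a carries every attribute of b."""
    return attrs[a] >= attrs[b]


universe = list(attrs)
generalized = join(leq, ["dog", "cat"], universe)
specialized = meet(leq, ["integer", "positive"], universe)
```

In this toy universe the generalization of `dog` and `cat` is `mammal`, and the specialization of `integer` and `positive` is `natural`; `dog` and `cat` have no common lower bound, so their specialization does not exist here.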

Similarly, we introduce operators $\vee$, $\wedge$ on the C-class names, according to the name
hierarchy.

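Due to Theorem 2, we have the following theorem.

Theorem 3 Let $\gamma$, $\gamma'$, $\bar\gamma = \{\gamma_i\}_{i=1}^{n}$, $\bar\gamma' = \{\gamma'_i\}_{i=1}^{n}$ be C-classes.

• Aggregation

Let $\bar\gamma \wedge \bar\gamma'$ be the sequence $(\gamma_1 \wedge \gamma'_1, \ldots, \gamma_n \wedge \gamma'_n)$, and let $\bar\gamma \vee \bar\gamma'$ be $(\gamma_1 \vee \gamma'_1, \ldots, \gamma_n \vee \gamma'_n)$.

For new C-class names $n, n', n''$ and a sequence of attribute names $\Phi$,

$n \wedge n' = n'' \Rightarrow \Pi(n, \bar\gamma, \Phi) \wedge \Pi(n', \bar\gamma', \Phi) = \Pi(n'', \bar\gamma \wedge \bar\gamma', \Phi),$

$n \vee n' = n'' \Rightarrow \Pi(n, \bar\gamma, \Phi) \vee \Pi(n', \bar\gamma', \Phi) \preceq \Pi(n'', \bar\gamma \vee \bar\gamma', \Phi).$

• Abstraction

For a common subset $\Psi$ of the attributes of $\gamma, \gamma'$,

$n \wedge n' = n'' \Rightarrow \Upsilon(n, \gamma, \Psi) \wedge \Upsilon(n', \gamma', \Psi) = \Upsilon(n'', \gamma \wedge \gamma', \Psi),$

$n \vee n' = n'' \Rightarrow \Upsilon(n, \gamma, \Psi) \vee \Upsilon(n', \gamma', \Psi) = \Upsilon(n'', \gamma \vee \gamma', \Psi).$

• Restriction

For unary predicates $S, S'$,

$n \wedge n' = n'' \Rightarrow \Theta(\gamma, S \wedge S') = \Theta(\gamma, S) \wedge \Theta(\gamma, S'),$

$n \vee n' = n'' \Rightarrow \Theta(\gamma, S \vee S') = \Theta(\gamma, S) \vee \Theta(\gamma, S').$

• Set Construction

$n \wedge n' = n'' \Rightarrow Set(n, \gamma) \wedge Set(n', \gamma') = Set(n'', \gamma \wedge \gamma'),$

$n \vee n' = n'' \Rightarrow Set(n, \gamma) \vee Set(n', \gamma') \preceq Set(n'', \gamma \vee \gamma').$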

We should note that in the previous two theorems, we always have to specify the name hierarchy
to obtain a reasonable result. The name hierarchy is an artificial hierarchy, and we have to
assign the order on the names of C-classes so that it is compatible with the natural semantic
hierarchy of concepts.

To summarize, we have introduced the notion of C-class and an order among C-classes to
formalize concepts and the semantic hierarchy of concepts. Moreover, we have introduced
formal operators on C-classes that provide a formalism of the mental processes that produce new
concepts out of existing concepts. Finally, we have provided some theorems to show that the
formalism provides the natural relation between the fundamental operators and the concept
hierarchy, which is one of the verifications of the correctness of the formalism.

4  Models  and  Instances 

So far, we have discussed the notion of C-classes, which is the formalization of database schema
objects. Now, we are going to discuss the actual data that will be in a database. We regard
a database as an expression of the real world. Each concept in the real world is expressed by a
C-class as defined in the previous chapter. Each occurrence of a concept is expressed as an instance
of a C-class.

In the framework of a value-oriented model, an instance of a C-class is just an element of
the data algebra that is the model of the C-class. The occurrence of a compound C-class is
determined by the set of attribute values. However, as we discussed in Section 1.1, we cannot
capture the real existence of the occurrence in this paradigm, because our conceptualization is
always incomplete, i.e., an approximation of the real concept. We need something other than
attribute values to distinguish the occurrences in the real world. It is the so-called object-identity,
which will be formalized.

In this chapter, we first define the value-oriented model of C-classes. A value-oriented
model of C-classes is a collection of data algebras that are specified by the C-classes. The
data algebra provides the space where the structure of the real world objects is expressed.
Next, we will extend the value-oriented model to an object-oriented model by introducing the
object-identity space.

4.1  Value-Oriented  Model  of  C-Classes

Let $\Gamma$ be a set of C-classes generated by the fundamental operators from a set $\Gamma_0$ of primitive
C-classes, and let

$D = (\{\delta_\gamma \mid \delta_\gamma = (n_\gamma, A_\gamma, r_\gamma),\ \gamma \in \Gamma\},\ \preceq_n)$

be the pair of a many-sorted data algebra with the sorts $S$ generated by $\Gamma_0$, and the name
hierarchy $\preceq_n$ of data algebras. Then $D$ is called a value-oriented model of $\Gamma$ if the following
conditions are satisfied. Let $\gamma$ be an element of $\Gamma$ such that:

$\gamma = (n_\gamma, \Phi_\gamma, \nu_\gamma, T_\gamma, A_\gamma, R_\gamma).$


• Primitive C-Classes

Each primitive C-class $\gamma$ satisfies:

- The universal algebra $A_\gamma$ is the algebra corresponding to a sort in $S$.

- The restriction function $r_\gamma$ of $\delta_\gamma$ is the interpretation of $R_\gamma$. We assume that each
predicate is interpreted as a function to $2 = \{0, 1\}$, where 1 is regarded as TRUE.

• Compound C-Classes

For any compound C-class $\gamma$, $A_\gamma$ is a subalgebra of $\prod_{f \in \Phi_\gamma} A_{\nu(f)}$. Typically, when $T_\gamma$ is
equal to T°, $A_\gamma$ is isomorphic to the product algebra $\prod_{f \in \Phi_\gamma} A_{\nu(f)}$ itself.

- Each function symbol $f$ in $\Phi_\gamma$ is interpreted as the projection from $\prod_{f \in \Phi_\gamma} A_{\nu(f)}$ to
$A_{\nu(f)}$.

- The restriction function $r_\gamma$ is also the interpretation of $R_\gamma$.

For a C-class $\gamma$ corresponding to a concept, an element of the data algebra $\delta_\gamma$ represents an
occurrence of the concept as a value. We call the element a value instance of $\gamma$. Furthermore,
the data algebras should be compatible with the hierarchy of C-classes. Namely,

$\forall \gamma \forall \gamma' \in \Gamma,\ \gamma \preceq \gamma' \Rightarrow \exists \rho_{\gamma\gamma'}: \delta_\gamma \rightarrow \delta_{\gamma'}$ ($\rho_{\gamma\gamma'}$ is the subtype mapping from $\gamma$ to $\gamma'$).

For the top C-class, we have a model $\delta_\top$ that is set-theoretically isomorphic to the set of
object-identities, which will be formally introduced in the next section:

$\delta_\top = ((\Omega, \emptyset), \bot).$

4.2  Object-Oriented  Model  of  C-Classes 

The  value-oriented  model  of  a  C-class  provides  the  base  of  the  algebraic  structure  for  expressing 
occurrences  of  concepts.  In  this  section,  we  extend  the  value-oriented  model  by  the  notion  of 
object-identity.  We  will  introduce  object-identity  space  to  express  the  real  existence  of  objects. 

Let $D$ be a value-oriented model of $\Gamma$ as defined in the previous section. Let $\mathbf{\Omega}$ be the pair
of a set $\Omega$ with an appropriate cardinality and a collection $\mathcal{F}$ of partial functions from $\Omega$ to itself.
We call $\mathbf{\Omega}$ the object-identity space. Further, let $I$ be a collection of partial functions from $\Omega$
to the data algebras in $D$, one for each $\gamma$ in $\Gamma$. Namely,

$D = \{\delta_\gamma \mid \gamma \in \Gamma\},$

$I = \{\iota_\gamma \mid \iota_\gamma: \Omega \rightarrow \delta_\gamma,\ \gamma \in \Gamma\}.$

The partial function $\iota_\gamma$ is called the instance mapping of $\gamma$. The domain $\partial(\gamma)$ of $\iota_\gamma$ is called
the object instances of $\gamma$.

Then, an object-oriented model $\mathcal{M}(\Gamma)$ of C-classes $\Gamma$ is a triplet

$\mathcal{M}(\Gamma) = (D, \mathbf{\Omega}, I),$

which satisfies the following conditions.

• Let $\gamma, \gamma'$ be in $\Gamma$. If $\gamma \preceq \gamma'$, then

$\partial(\gamma) \subseteq \partial(\gamma')$ and $\forall \omega \in \partial(\gamma),\ \iota_{\gamma'}(\omega) = \rho_{\gamma\gamma'} \circ \iota_\gamma(\omega),$

where $\rho_{\gamma\gamma'}$ is the subtype mapping from $\gamma$ to $\gamma'$ in the value-oriented model $D$. This
condition shows the compatibility of the hierarchies of the object-identity space and the
value-oriented model $D$. Note that the hierarchy of C-classes in the object-identity space
is expressed by the set inclusion of the domains of instance mappings.

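• For each function symbol $f$ that appears in the description of C-classes, there is a
corresponding partial function $o(f)$ in $\mathcal{F}$.

• The mappings $o(f)$ are related to the value-oriented interpretations via the mappings
in $I$ in the following way. Let us take a function symbol $f$ that appears in the
description of C-classes, which has a signature $n_1 n_2 \cdots n_m \rightarrow n$, where $n$ and the $n_i$ are concept
names of C-classes $\gamma$ and $\gamma_i$. Then we have the commutative equation:

$\iota \circ o(f) = \nu(f) \circ \pi_{i=1}^{m} \iota_i,$

[Commutative diagram: $o(f)$ from $\Omega^m$ to $\Omega$ on the object side, $\nu(f)$ from $\prod_{i=1}^{m} \delta_i$ to $\delta$ on the value side, connected by $\pi_{i=1}^{m} \iota_i$ and $\iota$.]

where

$\nu(f): \prod_{i=1}^{m} \delta_i \rightarrow \delta, \quad o(f): \Omega^m \rightarrow \Omega,$

and $\pi_{i=1}^{m} \iota_i$ is the product mapping of the $\iota_i$ ($i = 1, \ldots, m$):

$\pi_{i=1}^{m} \iota_i (\omega_1, \ldots, \omega_m) = (\iota_1(\omega_1), \ldots, \iota_m(\omega_m)).$

The data algebra $\delta$ corresponds to the C-class $\gamma$, and $\iota$ is the instance mapping of $\gamma$;
similarly, the data algebra $\delta_i$ and the instance mapping $\iota_i$ are those for $\gamma_i$ ($1 \le i \le m$).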

The definition of signature is provided in [(IB Sa].

The above commutativity is the essence of our model. It clearly separates the "object-
oriented part" and the "value-oriented part". We call it fundamental commutativity. Further, it
demonstrates the essential difference between an object-oriented data model and a value-oriented
data model: the difference lies in
the object-identity. There are several features other than object-identity that are generally
considered to characterize an object-oriented model, such as complex objects, inheritance, etc.
However, as we will see later, the semantics of those features can be captured by algebraic
constructs, such as types and aggregation operators, when we express instances as elements (values)
of a data algebra.
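The fundamental commutativity can be checked mechanically on a finite example. The following sketch assumes, for illustration only, that instance mappings and the object-level functions are finite partial functions stored as Python dictionaries, and that the value-level interpretation is an ordinary Python function; `commutes` and all sample identifiers (`p1`, `s1`, and so on) are hypothetical.

```python
def commutes(o_f, v_f, iota_args, iota_result, arg_tuples):
    """Check iota(o(f)(w1..wm)) == v(f)(iota_1(w1), ..., iota_m(wm)) wherever o(f) is defined."""
    for omegas in arg_tuples:
        if omegas not in o_f:
            continue  # o(f) is a partial function
        lhs = iota_result[o_f[omegas]]
        rhs = v_f(*(iota[w] for iota, w in zip(iota_args, omegas)))
        if lhs != rhs:
            return False
    return True


# Two distinct person identities that happen to carry the same value.
iota_person = {"p1": ("John",), "p2": ("John",)}
iota_string = {"s1": "John", "s2": "Mary"}

# Object-level function for the attribute 'name' and its value-level interpretation.
o_name = {("p1",): "s1", ("p2",): "s1"}


def v_name(record):
    """Value-level interpretation of 'name': projection onto the first component."""
    return record[0]


ok = commutes(o_name, v_name, [iota_person], iota_string, [("p1",), ("p2",)])
broken = commutes({("p1",): "s2"}, v_name, [iota_person], iota_string, [("p1",)])
```

The example also illustrates the separation described above: `p1` and `p2` are different object-identities although their value instances coincide.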

The set of instance mappings $\{\iota_\gamma \mid \gamma \in \Gamma\}$ is called a schema instance of $\Gamma$.

The object-identity space $\mathbf{\Omega}$ is a flat¹¹ set with a set of partial functions. The value-oriented
model $D$ provides a structure on $\mathbf{\Omega}$, which is called the value space of $\Gamma$. The instance mapping
of a C-class expresses the correspondence between object instances and value instances.

We have a natural ordering for schema instances. Let the object-identity space $\mathbf{\Omega}$ and the
value-oriented model $D$ be fixed, and let $I$ and $I'$ be schema instances of $\Gamma$:

$I = \{\iota_\gamma \mid \gamma \in \Gamma\}, \quad I' = \{\iota'_\gamma \mid \gamma \in \Gamma\}.$

We call the schema instance $I$ a schema subinstance of $I'$ and denote it by

$I \preceq I',$

if

$\forall \gamma \in \Gamma,\ \iota'_\gamma$ is an extension of $\iota_\gamma$.

This ordering is useful when we consider the schema instances of C-classes with recursive
structure. Obviously, the order is a partial order. If $I$ and $I'$ coincide on the intersection of
their domains,

$\forall \gamma \in \Gamma,\ \forall x \in \partial(\iota_\gamma) \cap \partial(\iota'_\gamma),\ \iota_\gamma(x) = \iota'_\gamma(x),$

we call them compatible. It is easy to prove that any set of compatible schema instances has
a least upper bound with respect to the above order.
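The ordering and the compatibility condition can be sketched with finite partial functions. The sketch assumes, as an illustration, that a schema instance is a dictionary from C-class names to dictionaries mapping object-identities to values; the function names and the sample instances are hypothetical. For compatible instances the least upper bound is simply the union of the partial functions.

```python
def is_subinstance(i, j):
    """True if every instance mapping of j extends the corresponding mapping of i."""
    return all(
        w in j.get(c, {}) and j[c][w] == val
        for c, mapping in i.items()
        for w, val in mapping.items()
    )


def compatible(i, j):
    """True if i and j coincide on the intersection of their domains."""
    return all(
        i[c][w] == j[c][w]
        for c in i.keys() & j.keys()
        for w in i[c].keys() & j[c].keys()
    )


def least_upper_bound(instances):
    """Union of pairwise compatible schema instances; raises ValueError otherwise."""
    merged = {}
    for inst in instances:
        if not compatible(merged, inst):
            raise ValueError("schema instances are not compatible")
        for c, mapping in inst.items():
            merged.setdefault(c, {}).update(mapping)
    return merged


small = {"person": {"w1": "John"}}
large = {"person": {"w1": "John", "w2": "Mary"}}
other = {"person": {"w3": "Ann"}}
lub = least_upper_bound([small, other])
```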

4.3  Induced  Mapping  on  Instances 

In this section, we discuss how the fundamental operators on C-classes are interpreted in the
object-oriented model.

The induced fundamental operators are the mappings that transform instance mappings to
other instance mappings. For given C-classes, we can create new C-classes using the fundamental
operators. Accordingly, for the created C-classes, we can create instance mappings out of the
instance mappings of the original C-classes. In this section, by the term "instance mapping", we
mean a partial function from the object-identity space to a data algebra, which may provide an
object-oriented model. As discussed later, the induced instance mapping will not provide an
object-oriented model for a certain kind of fundamental operator.

Let us assume that an object-oriented model $\mathcal{M}(\Gamma)$ of $\Gamma$ is given:

$\mathcal{M}(\Gamma) = (D, \mathbf{\Omega}, I), \quad I = \{\iota_\gamma: \Omega \rightarrow \delta_\gamma \mid \gamma \in \Gamma\}.$

11 By the term flat, we mean that no element of the set has a substructure.

We assume $D$ and $\mathbf{\Omega}$ are fixed. As defined before, an instance mapping $\iota_\gamma$ is a partial function
from the object-identity space $\Omega$ to the data algebra $\delta_\gamma$. We denote the domain of an instance
mapping $\iota_\gamma$ by $\partial(\iota_\gamma)$. Let $\iota$ be the instance mapping of $\gamma$ in $I$,

$\gamma = (n_\gamma, \Phi_\gamma, \nu_\gamma, T_\gamma, A_\gamma, R_\gamma).$

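• Restriction

Let $S$ be a unary predicate that is intended to impose a restriction on $\gamma$. The induced
restriction operator $\Theta(\cdot, S)$ is defined as:

$\partial(\Theta(\iota, S)) \stackrel{def}{=} \{\omega \in \partial(\iota) \mid \nu(S)(\iota(\omega)) = 1\},$

$\forall \omega \in \partial(\Theta(\iota, S)),\ \Theta(\iota, S)(\omega) \stackrel{def}{=} \iota(\omega).$

Intuitively, the induced restriction operator takes only the instances that satisfy the predicate
$S$. Note that the predicate symbol $S$ is interpreted as a mapping from $\delta_\gamma$ to 2.

• Abstraction

Let $\Psi$ be a subset of $\Phi_\gamma$. The induced abstraction operator $\Upsilon(\cdot, \Psi)$ is defined by:

$\partial(\Upsilon(\iota, \Psi)) \stackrel{def}{=} \partial(\iota),$

$\forall \omega \in \partial(\Upsilon(\iota, \Psi)),\ \Upsilon(\iota, \Psi)(\omega) \stackrel{def}{=} P_\Psi \circ \iota(\omega),$
where $P_\Psi$ is the projection from $\prod_{s \in \Phi_\gamma} A_s$ to $\prod_{s \in \Psi} A_s$.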

• Aggregation

The induced operator for aggregation is different from the above operators, because it
is a constructive operator. Let $\gamma_i$ be a C-class and let $\iota_i$ be the instance mapping for $\gamma_i$
($i = 1, \ldots, n$). Then the induced aggregation operator $\Pi(\cdot)$ is defined as follows. The
domain $\partial(\Pi((\iota_1, \ldots, \iota_n)))$ of the induced mapping is a new subset of $\Omega$ that has a one-to-one
correspondence to $\prod_{i=1}^{n} \partial(\iota_i)$ with a mapping $\varepsilon$:

$\varepsilon: \partial(\Pi((\iota_1, \ldots, \iota_n))) \cong \prod_{i=1}^{n} \partial(\iota_i).$

Then the induced instance mapping is defined by:

$\Pi((\iota_1, \ldots, \iota_n)) \stackrel{def}{=} (\pi_{i=1}^{n} \iota_i) \circ \varepsilon.$

There is a technical detail concerning the aggregation operator. If we already have
an instance mapping $\iota$ for the aggregated C-class, we impose a condition on the invention
of object-identities so that the newly derived instance mapping is an extension of the
existing one.

• Recursive Aggregation

The induced operator for a recursive aggregation is obtained by the inductive limit of
generated instances. More precisely, we first define an inflationary operator to produce new
instances. Then we take the limit of successive applications of the operator.

Let $G$, $V$, $E$, $W$ and $\Phi$ be the same as in Section 3.3.2. Let $\Gamma$ be the set of C-classes
corresponding to $V$, and let $D$ be the object-oriented model of $\Gamma$,

$\Gamma = \{\gamma_u \mid u \in V\}, \quad D = \{\delta_u \mid u \in V\}.$

Further, let $\mathcal{I}$ be the collection of all schema instances of $\Gamma$. We define an operator $\mathcal{E}_W$
from $\mathcal{I}$ to $\mathcal{I}$. Let $I$ be in $\mathcal{I}$,

$I = \{\iota_u: \Omega \rightarrow \delta_u \mid u \in V\}.$

Then

$\mathcal{E}_W(I) \stackrel{def}{=} I \vee E_W(I),$

where $\vee$ designates the least upper bound with respect to the schema instance ordering
defined at the end of Section 4.2. The schema instance $E_W(I)$ is defined as the minimal
schema instance that is compatible with $I$ and satisfies the following conditions:

$\forall v \in W,\ E_W(I)_v = \iota_v,$

$\forall v \in V - W,\ \forall u \forall g$ s.t. $(v, u, g) \in E,\ \pi_g(\mathrm{cod}(E_W(I)_v)) \supseteq \mathrm{cod}(\iota_u),$

where $\mathrm{cod}(E_W(I)_v)$ and $\mathrm{cod}(\iota_u)$ are the codomains of $E_W(I)_v$ and $\iota_u$ respectively, and $\pi_g$ is the
projection corresponding to the edge $(v, u, g)$. Note that $E_W(I)$ may not be unique¹³. For a
given schema instance $I$, we construct a monotone increasing sequence of schema instances
$\{I_n\}_{n=0}^{\infty}$ by applying $\mathcal{E}_W$ successively:

$I_0 = I, \quad I_{n+1} \stackrel{def}{=} \mathcal{E}_W(I_n).$

Since $\{I_n\}_{n=0}^{\infty}$ forms a compatible set of schema instances, we can obtain the inductive
limit $I_\infty$ as the least upper bound of the set. Then we define the induced schema instance
$\Pi(G, \Phi, W)$ as $I_\infty$:

$V = \{v_1, v_2, \ldots, v_m\}, \quad I_\infty = \{\iota_i: \Omega \rightarrow \delta_i \mid 1 \le i \le m\},$

$\Pi(G, \Phi, W) = \Pi((\iota_1, \ldots, \iota_m)).$

By definition, the instance mappings for the C-classes in $W$ will not change with $\mathcal{E}_W$. We
call $W$ the set of stable C-classes. If $W$ is equal to $V'$, the recursive aggregation reduces
to the original aggregation defined above.

• Set Construction

For a set construction, we can naturally induce an instance mapping. The induced
instance mapping describes the instances with all the possible finite sets of original
instances. More precisely, let $\iota$ be an instance mapping of a C-class $\gamma$,

$\iota: \Omega \rightarrow \delta_\gamma.$

Then the mapping $\bar\iota$ induced by set construction is a minimal instance mapping such that its
codomain includes all finite sets generated by the codomain of $\iota$:

$\mathrm{cod}(\bar\iota) \supseteq \{\{x_1, x_2, \ldots, x_n\} \mid x_i \in \mathrm{cod}(\iota)\ (1 \le i \le n),\ n = 0, 1, 2, \ldots\}.$

The induced mapping is not unique. If the C-class $Set(n, \gamma)$ has a non-null instance
mapping from the beginning, we construct the instance mapping $\bar\iota$ so that $\bar\iota$ is an extension
of it.

13 Actually, $E_W$ is a multi-valued function. However, we consider it as an ordinary function by taking one of its
values. The consistency of $E_W$ can be easily proven using the fundamental commutativity.

• Categorization

The induced mapping for the categorization is obtained by the composition of the induced
mappings of the set construction, aggregation, and restriction operators, according to the
definition of the categorization.

We should notice that the induced instance mapping may not be unique for (generalized)
aggregation and set construction. This is due to the fact that these operators require
object-identity invention [AK 89].
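The induced restriction, abstraction and (non-recursive) aggregation operators can be sketched on finite instance mappings. The sketch assumes, for illustration, that an instance mapping is a dictionary from object-identities to values and that compound values are dictionaries keyed by attribute name; the helper names and the `fresh_id` callable that invents new object-identities are hypothetical. The arbitrary choice made by `fresh_id` is exactly where the non-uniqueness noted above enters.

```python
import itertools


def induced_restriction(iota, pred):
    """Keep only the object instances whose value satisfies the predicate."""
    return {w: v for w, v in iota.items() if pred(v)}


def induced_abstraction(iota, psi):
    """Same domain; each value is projected onto the attributes in psi."""
    return {w: {f: v[f] for f in psi} for w, v in iota.items()}


def induced_aggregation(iotas, attrs, fresh_id):
    """Invent one new identity per tuple of component identities.

    Returns the induced instance mapping and the one-to-one map epsilon
    from new identities to tuples of component identities.
    """
    result, epsilon = {}, {}
    for combo in itertools.product(*(sorted(i) for i in iotas)):
        w = fresh_id()
        epsilon[w] = combo
        result[w] = {f: i[c] for f, i, c in zip(attrs, iotas, combo)}
    return result, epsilon


iota_person = {
    "w1": {"name": "John", "age": 30},
    "w2": {"name": "Mary", "age": 41},
}
older = induced_restriction(iota_person, lambda v: v["age"] > 35)
names = induced_abstraction(iota_person, ["name"])

counter = itertools.count()
labels = {"s1": "a", "s2": "b"}
counts = {"n1": 1}
pairs, eps = induced_aggregation(
    [labels, counts], ["label", "count"], lambda: f"agg{next(counter)}"
)
```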

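Furthermore, we can introduce operators on instances that correspond to the generalization/specialization
operators.

Let $\gamma_i$ be C-classes,

$\gamma_i = (n_i, \Phi_i, \nu_i, T_i, A_i, R_i) \quad (i = 1, 2),$

and let $(\prod_{f \in \Phi_i} A_{i,f},\ r_i)$ be the data algebra corresponding to $\gamma_i$. Further, let $\iota_i$ be an instance
mapping of $\gamma_i$, and let $P_i$ be the projection from $\prod_{f \in \Phi_i} A_{i,f}$ to $\prod_{f \in \Phi_1 \cap \Phi_2} A_{i,f}$ ($i = 1, 2$). If

$\forall f \in \Phi_1 \cap \Phi_2,\ A_{1,f} = A_{2,f}$ and $P_1 \circ \iota_1 = P_2 \circ \iota_2$ on $\partial(\iota_1) \cap \partial(\iota_2),$

the induced generalization and specialization of $\iota_1$ and $\iota_2$ are defined as follows.

• Generalization

The induced generalization operator $\vee$ is defined as:

- if the intersection of $\Phi_1$ and $\Phi_2$ is not empty,

$\partial(\iota_1 \vee \iota_2) \stackrel{def}{=} \partial(\iota_1) \cup \partial(\iota_2),$

$\forall \omega \in \partial(\iota_i),\ (\iota_1 \vee \iota_2)(\omega) \stackrel{def}{=} P_i(\iota_i(\omega)) \quad (i = 1, 2).$

- if the intersection of $\Phi_1$ and $\Phi_2$ is empty, the domain of $\iota_1 \vee \iota_2$ is the same as above,
and

$(\iota_1 \vee \iota_2): \Omega \rightarrow \delta_\top = ((\Omega, \emptyset), \bot)$

is the inclusion mapping.

• Specialization

For the specialization operator on C-classes, we have the following induced specialization
operator $\wedge$. The operator $\wedge$ is defined as:

$\partial(\iota_1 \wedge \iota_2) \stackrel{def}{=} \partial(\iota_1) \cap \partial(\iota_2),$

$\forall \omega \in \partial(\iota_1 \wedge \iota_2),\ (\iota_1 \wedge \iota_2)(\omega)$ is determined by

$\Pi_i \circ (\iota_1 \wedge \iota_2) = \iota_i \quad (i = 1, 2).$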
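The induced generalization and specialization can be sketched in the same style. The sketch assumes, for illustration, instance mappings stored as dictionaries from object-identities to attribute dictionaries, with the attribute sets passed explicitly; the function names and the employee/student sample data are hypothetical. When the two attribute sets are disjoint, the generalization falls back to the inclusion mapping into the object-identities, as in the definition above.

```python
def agree_on_common(i1, i2, common):
    """The two mappings must coincide on shared attributes over shared identities."""
    return all(
        i1[w][f] == i2[w][f] for w in i1.keys() & i2.keys() for f in common
    )


def induced_generalization(i1, phi1, i2, phi2):
    """Union of the domains; values are projected onto the common attributes."""
    common = set(phi1) & set(phi2)
    if not agree_on_common(i1, i2, common):
        raise ValueError("instance mappings disagree on common attributes")
    if not common:
        # Empty intersection: inclusion mapping into the object-identities.
        return {w: w for w in i1.keys() | i2.keys()}
    out = {}
    for iota in (i1, i2):
        for w, v in iota.items():
            out[w] = {f: v[f] for f in common}
    return out


def induced_specialization(i1, phi1, i2, phi2):
    """Intersection of the domains; values carry the attributes of both classes."""
    common = set(phi1) & set(phi2)
    if not agree_on_common(i1, i2, common):
        raise ValueError("instance mappings disagree on common attributes")
    return {w: {**i1[w], **i2[w]} for w in i1.keys() & i2.keys()}


employee = {
    "w1": {"name": "John", "salary": 10},
    "w2": {"name": "Mary", "salary": 20},
}
student = {
    "w2": {"name": "Mary", "school": "X"},
    "w3": {"name": "Ann", "school": "Y"},
}
gen = induced_generalization(employee, ["name", "salary"], student, ["name", "school"])
spec = induced_specialization(employee, ["name", "salary"], student, ["name", "school"])
```

Projecting `spec` back onto the attributes of either class recovers the original value for the shared identity `w2`, which is the defining property of the induced specialization.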

Although we can derive new instances by the induced operators, we should note that these
instances are just possible candidate instances in our model. However, in an intuitive sense, if
a C-class is derived by a fundamental operator other than aggregation or set construction,
the instance mapping should be obtained by the induced operators. We should note that
our object-oriented model is fairly general. Hence we would get a variety of "actual models"
according to the way of providing instance mappings. To provide instance mappings by the
induced mappings of the fundamental operators is a canonical way of obtaining an
object-oriented model.

5  Database  Design

5.1  Entity  C-Classes  and  Abstract  C-Classes 

In our object-oriented model of C-classes, there can be more than one object-identity
corresponding to one element of a data algebra. Because our conceptualization is incomplete, we
cannot characterize the real existence of objects by their attribute values. However, in order
to provide a representation, we should assume that the existence can be described by attribute
values for certain concepts, at least in a closed domain of the real world. This is a matter of
knowledge base design.

Hence it is important to analyze in which case a C-class should be characterized by its
attribute values, or more generally, in which case the object instances are equivalent to the
value instances. Namely, we should consider when we should require the instance mapping of a
C-class to be injective. In this section, we consider two kinds of C-classes whose instance
mappings will be injective. One is the algebraic C-class, the other is the logical C-class. Further,
we claim that even the instance mapping of a logical C-class has the inherent possibility of not
being injective, because our knowledge representation is always incomplete.

First, we introduce and discuss the algebraic C-classes. Let us consider the concept string,
for example. What are the instances of string? It depends on the context in which we consider
the concept. We can say that every string appearing in the real world can be an instance of the
C-class String. Consider the following two identical sentences.

• "string" is an instance of String.

• "string" is an instance of String.

The string "string" in the first sentence is an instance of String which is different from the
instance "string" in the second sentence. However, we often need to abstract the real
occurrences of String and regard the many instances as the same object. This is exactly what the
value-oriented model of the C-class String is intended to be. The universal algebra $A_{String}$ is the
abstraction of real occurrences of strings with abstracted functions such as length and concatenate.

The algebraic model $A_{String}$ is virtual and does not exist in the real world. However, we want to
treat a virtual model, such as the algebra $A_{String}$, as if it existed in the real world. In other
words, we want to allow the conceptual existence of abstract objects. So we introduce a
category of C-classes whose instances are virtually the same as the domain of an algebra in
the value-oriented model. Namely, the instance mapping is injective. We call such C-classes
algebraic C-classes. An algebraic C-class is a kind of "literal."

Other than algebraic C-classes, there is another kind of C-class whose instance mapping
should be injective. It is the C-class derived from a logical relation. We can express an n-ary
logical relation by a C-class with n attributes. Since an occurrence of a logical relation is nothing
but an element of a subset of the cartesian product of domains (object-identities), it is exactly
characterized by its attribute values. We call such C-classes logical C-classes. The notion of
logical C-classes will be discussed in detail with an example later in this section.

Note that the notions of algebraic C-classes and logical C-classes are not determined by
object-oriented models. Rather, they are required at the meta level. In other words, it is a design
issue of knowledge representation whether we require a C-class to be an algebraic or logical
C-class. We call a C-class an abstract C-class if we impose the restriction that its instance
mapping is injective.

The abstract C-classes strictly fit into the value-oriented data model. If all the C-classes
are abstract C-classes, any object-oriented model is essentially the same as a value-oriented
model.
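Whether a given instance mapping happens to be injective can be tested directly on a finite model. The sketch assumes, for illustration, an instance mapping stored as a dictionary from object-identities to hashable values; `is_injective`, `fibers` and the sample mappings are hypothetical. Note that such a check only inspects one particular model, whereas being an abstract C-class is a requirement imposed at the design level.

```python
from collections import defaultdict


def is_injective(iota):
    """True if no two object-identities are mapped to the same value."""
    values = list(iota.values())
    return len(set(values)) == len(values)


def fibers(iota):
    """Group object-identities by the value they are mapped to."""
    groups = defaultdict(set)
    for w, v in iota.items():
        groups[v].add(w)
    return dict(groups)


# An algebraic-style mapping: one identity per abstract string.
iota_string = {"s1": "string", "s2": "rope"}

# An entity-style mapping: two distinct persons with identical attribute values.
iota_person = {"p1": ("John",), "p2": ("John",), "p3": ("Mary",)}
```

Here `iota_string` is injective, while `fibers(iota_person)` shows two identities sharing the value `("John",)`, which a value-oriented model alone could not tell apart.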

We call the remaining C-classes, whose instance mappings are not intended
to be injective, entity C-classes. The entity C-classes are the representation of the "existing objects" in the real
world. In a practical design of a knowledge base, the physical objects and events are expressed
as entity C-classes. This design issue will be discussed in a later section. For example,
'person', 'animal', 'company', 'meeting' and 'order' are entity C-classes. Note that the "existing
objects" need not be physical objects or events. An existing object can be some abstract object,
which is still an expression of the existence of "something" in the real world. Basically, anything
that can be a noun will be an entity C-class. Hence, even 'friendship' and 'love' can be entity
C-classes. Actually, the author presumes that nominalization in the mental process of a human
being is essentially the same as creating an entity C-class. The identity of an instance of an entity C-class is
characterized by its object-identity.

We emphasize again that the notions of abstract C-class and entity C-class are not determined
by the model. The instance mapping of an entity C-class may be injective in some particular
object-oriented model. It is a meta level requirement, i.e., a design level requirement.

It  is  controversial  whether  we  should  express  a  logical  relation  as  a  C-class.  Alternatively, 
we  can  introduce  the  notion  of  logical  relation  as  another  construct  of  our  theory.  There  are 
two  reasons  why  we  express  logical  relations  as  C-classes. 

•  It may be the case that an occurrence of a logical relation will be converted to an existing
object by a certain meta operation, which will be discussed in the rest of this section. So
it is more convenient to express logical relations as C-classes, because the meta operation
can be expressed as just a mapping from one C-class to another C-class.

•  It  is  better  to  have  only  C-classes  as  the  basic  construct  of  the  model  so  that  we  can 
treat  the  knowledge  representation  in  a  simple  and  homogeneous  way. 

In the rest of this section, we consider in depth the meaning of
entity C-classes and logical C-classes. In particular, we discuss the meta operation that
converts a logical C-class to an entity C-class.

A  logical  C-class  is  a  compound  C-class  that  we  make  up  to  express  a  logical  relation  of 
the  real  world  objects. 

Let us consider a concept Person with attributes name and loving. Further, let $\iota$ be the
instance mapping of Person and $\delta(\iota)$ be the domain of $\iota$. The attribute value $loving(\omega)$
designates the people that $\omega$ loves.

$Person = (person, \{name, loving\}, v_{person}, T_{person}, \emptyset, TRUE)$,
$v_{person}(name) = String$, $v_{person}(loving) = Set\_of\_Person$.

For example,

$\omega, \omega' \in \delta(\iota)$, $name(\omega) =$ "John", $name(\omega') =$ "Mary", $In(\omega', loving(\omega))$

means that the person $\omega$ named "John" loves the person $\omega'$ named "Mary."

A C-class $\heartsuit$ will be a logical C-class with two attributes 'loves' and 'loved' pointing to persons:

$\heartsuit = (affection, \{loves, loved\}, v_\heartsuit, R_\heartsuit)$,
$v_\heartsuit(loves) = person$, $v_\heartsuit(loved) = person$,

where

$R_\heartsuit(x) = In(loved(x), loving(loves(x)))$,

and the structural sentence $T_\heartsuit$ is similar to $T_\gamma$ for a C-class $\gamma$ with non-empty
attributes. It is important that the existence of the 'affection' is derived from the
attributes (state) of the persons. In this case, it is derived from the attributes of the loving
person. The restriction form $R_\heartsuit$ is not only a restriction but also the definition of the C-class
$\heartsuit$. Namely, the existence of an instance is exactly specified by $R_\heartsuit$. Generally, the occurrence of a
logical C-class $\gamma$ is specified by the restriction predicate $R_\gamma$. Thus the identity of a logical
C-class should be determined completely by its attribute values. The occurrences of a logical
C-class should be the same if and only if their attribute values are the same. Thus one might
say that a logical C-class can be dealt with by the value-oriented paradigm. However, it is not
so simple.
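The way the occurrences of the logical C-class are derived from the state of the persons can be sketched as follows. This is only an illustration under assumed encodings: persons are stored in a dictionary keyed by hypothetical object identities, and the function `affection_instances` enumerates the attribute pairs (loves, loved) that satisfy the restriction predicate.

```python
# Hypothetical object identities "w1", "w2" and a dictionary encoding of
# the attributes name and loving of the concept Person.
persons = {
    "w1": {"name": "John", "loving": {"w2"}},
    "w2": {"name": "Mary", "loving": set()},
}


def affection_instances(people):
    """Enumerate the pairs (loves, loved) with In(loved, loving(loves))."""
    return {
        (lover, loved)
        for lover, attributes in people.items()
        for loved in attributes["loving"]
        if loved in people
    }
```

In this encoding each occurrence is identified purely by its pair of attribute values, which is the value-oriented reading of a logical C-class.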

We should notice that even a logical C-class is an approximation of the real world. In
the above example, we specified the C-class affection with the predicate $R_\heartsuit$. If the predicate
completely specified an "affection", the attribute values would determine the equivalence of
instances. However, it does not. 'John' loved 'Mary' yesterday, i.e., the predicate $R_\heartsuit$ held for
'John' yesterday, but it doesn't hold today. Even in such a case, we can still think of "yesterday's
love of John for Mary." The instance of the concept has acquired an object identity. The reason is
that the specification by the predicate $R_\heartsuit$ lacked temporal information. If it had included
a temporal attribute, we could have expressed "yesterday's love" by attribute values alone.
Therefore, due to the incompleteness of our representation, even a logical C-class may
end up as an entity C-class. Hence we introduce a meta operation $N$ that converts a logical
C-class to an entity C-class. We call $N$ a nominalization operator. The nominalization operator
corresponds to the mental process of putting a name to a chunk of information that we
have acquired.

As  discussed  above,  every  C-class  may  be  inherently  an  entity  C-class.  However,  in  order 
to  organize  the  knowledge  representation,  we  should  impose  a  condition  that  certain  C-classes 
are  to  be  abstract  C-classes,  as  discussed  in  the  next  section. 

5.2  The  Concept  Model 

In this section, we introduce the concept model for database design and discuss its semantics.

5.2.1  Design  Process 

First, we discuss the design of a knowledge representation. As we mentioned in the previous
section, even an instance of a logical relation may be an instance with object-identity. However,
when we develop a knowledge representation, we have to assume that some of the C-classes
are abstract C-classes. For example, when we register a new instance of a C-class in the
knowledgebase, we have to know whether the instance is already stored or not. As we discussed,
we can only believe that we can distinguish the instances by our representation. This
is a matter of correctness of the knowledge representation. Hence, when we design a knowledge
representation using C-classes, the main issue is which C-classes we should regard as the
basic abstract C-classes.
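The registration problem mentioned above can be made concrete with a small sketch. It assumes that a C-class regarded as abstract is stored in a hypothetical `BaseClassStore` that identifies instances by their attribute values alone; the class, its method, and the generated identifiers are illustrative assumptions, not part of the model.

```python
from typing import Dict, Hashable, Tuple


class BaseClassStore:
    """Hypothetical store for a C-class whose instance mapping is assumed injective."""

    def __init__(self) -> None:
        # Maps a sorted tuple of (attribute, value) pairs to an object identity.
        self._oid_by_value: Dict[Tuple[Tuple[str, Hashable], ...], str] = {}

    def register(self, **attributes: Hashable) -> Tuple[str, bool]:
        """Return (oid, created); an instance with equal attribute values is reused."""
        key = tuple(sorted(attributes.items()))
        if key in self._oid_by_value:
            return self._oid_by_value[key], False
        oid = f"oid{len(self._oid_by_value) + 1}"
        self._oid_by_value[key] = oid
        return oid, True


store = BaseClassStore()
first = store.register(name="John", birth="1960-01-01")
again = store.register(name="John", birth="1960-01-01")
other = store.register(name="Mary", birth="1962-05-17")
```

The lookup is only as reliable as the chosen attributes: if two real persons share all recorded values, the store silently merges them, which is exactly the situation in which the schema has to be extended.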

The design process consists of the following steps.

1.  Provide algebraic C-classes, such as Integer, String, Set, Sequence. Further, we provide
primitive functions and predicates, for example {+, >, ...} for Integer and {union,
intersection, In} for Set.




2.  Choose real world concepts that provide the basis of our knowledge representation and
express them by C-classes. We introduce as many attributes as possible to those C-classes,
so that we can assume that their instances are fully specified by attribute values,
i.e., the instance mapping is injective. We call such C-classes base C-classes. For example,
a concept person would be expressed by a base C-class Real_Person. We assign as many
attributes as possible so that we can distinguish individual persons. (Concepts
such as employee and student can be expressed by C-classes derived from Real_Person by
the abstraction operator, because we don't need all the attributes of Real_Person to express
an employee or a student.)

We should note that base C-classes are inherently entity C-classes, although we regard
them as abstract C-classes. In fact, when we view the knowledge representation from
a perspective different from the original design, or when we add a new C-class to the
schema, a base C-class may become an entity C-class. In such a case, we have to modify
the schema by adding new attributes to the base C-class, in order to keep up our
requirement that the C-class should be an abstract C-class.

The guidelines for selecting base C-classes are as follows.

•  Physical objects should be base C-classes, for instance, person, car, location, etc. Social
organizations, such as company, may be considered as physical objects, because
they consist of physical objects, such as employee, office, factory, etc.

•  Events  should  be  base  C-classes.  For  instance,  meeting,  accident,  order  form  of 
parts,  etc. 

3.  Analyze the relations among base C-classes and check that every necessary logical relation
among base C-classes can be expressed by the attributes of base C-classes. We add
new attributes, if necessary. The point is that all information should be included in the
attributes of base C-classes. If so, we can express any information by the C-classes derived
by the fundamental operators from base C-classes. Hence, the integrity constraints of the
knowledgebase will be completely described by the restriction predicates of base C-classes.
Thus, in order to maintain consistency, we only have to maintain that of the base C-classes.
For example, when we consider a C-class Person and a C-class Car, there may be a logical
relation OwnerCar. We express it with the attributes owns of Person and owner of
Car. The attribute owns designates the belongings of a person, and the attribute owner
designates the owner of a car. Then we will express the OwnerCar relation by a logical
C-class with attributes {owner, car} and the restriction predicate $R_{ownercar}$:

$OwnerCar = (ownercar, \{owner, car\}, v_{ownercar}, R_{ownercar})$,

$v_{ownercar}(owner) = person$, $v_{ownercar}(car) = object$,

$R_{ownercar}(x) = In(car(x), owns(owner(x)))$,

where $T_\gamma$ is the same as in Section 3.1.3. The restriction predicate means that the car
$car(x)$ is one of the belongings of the person $owner(x)$.

It is an important requirement that we can construct every logical relation from attributes
and from primitive functions and predicates of algebraic C-classes A and base C-classes B. If
so, we can construct any logical relation through fundamental operators from A and B.
Hence it will allow us to provide the semantics of those logical C-classes using induced
instance mappings. We will discuss this in the next section.




4.  We define appropriate "view" C-classes using fundamental operators. Logical C-classes
will be defined by the (generalized) aggregation operator, while entity C-classes will be
defined by abstraction and restriction operators.

5.2.2  The  Concept  Model  and  Its  Semantics 

A concept model M of a knowledgebase is a tuple consisting of C-classes of three kinds together
with a C-class hierarchy:

M  = 

A is the set of algebraic C-classes such as Integer, String, etc. B is the set of base C-classes.
P is the set of all derivable C-classes, which can be derived by a finite application of the
fundamental operators from $A \cup B$. We should note that the union of A, B and P forms the
universal closure of the union of A and B.

The semantics of the model is as follows. Let $\Gamma$ be a finite subset of the union of A, B and P
such that for each C-class $\gamma$ in $\Gamma$, the C-classes that are the attribute values of $\gamma$ are also in $\Gamma$:

7  =  (.-i-v,  tv>.  T-,.  A.,,  /?-,). 

$v_\gamma(l) \in \Gamma$.

We call such a set of C-classes a closed set of C-classes.

The semantics of the concept model is provided by an object-oriented model $(D, \Omega, I)$ of
the C-classes $\Gamma$ with the following conditions for $I$. Let $A_\gamma$ be a data algebra in $D$ that is the
model of $\gamma$ in $\Gamma$.

•  The instance mapping of a C-class $\gamma$ in A is an injective and surjective partial function from
$\Omega$ to $A_\gamma$.

•  The instance mapping of a C-class $\gamma$ in P is obtained as the induced instance mapping of the
fundamental operators that define the C-class. For a recursive aggregation, we require
that the base C-classes are always treated as stable C-classes. We will consider this
induced mapping in detail in the next section.

•  The instance mappings of base C-classes express the instances that exist in the real
world. The instance mapping of a C-class $\gamma$ in B is intended to be injective by the
knowledgebase designer. However, we don't impose this restriction as part of the formal
semantics. If the instance mapping turns out not to be injective, the schema of
the knowledgebase should be altered. This is a matter of schema maintenance. Note that
a base C-class may be defined with fundamental operators from other base C-classes and
algebraic C-classes. However, its instance mapping is not derived as an induced instance
mapping. The instances are created by update operations of the user.

As we discuss in Appendix A, one of the characteristics of this model is the homogeneous
representation of queries. There is no distinction between the three kinds of C-classes for
users, as far as queries are concerned. A user doesn't have to consider which C-class corresponds
to the data stored in the knowledgebase. Each C-class is automatically bound to a set
of instances by the system. The homogeneity of C-classes brings a clear semantics of view
update, which will be discussed in Section A.2.1.

5.2.3  Two  Kinds  of  Predicates

If the derived C-classes are recursively defined, their instance mappings are not always
determined and may not even exist. In this section, we consider this matter further.

First we extend the graph we discussed in the definition of generalized aggregation. We allow
the labels of edges to be operator expressions that express the other fundamental operators.
For example,


$Person = (person, \{name, height, father\}, v_{Person}, T_{Person}, \emptyset, TRUE)$,

$v_{Person}(name) = string$, $v_{Person}(height) = integer$,
$v_{Person}(father) = person$.

$TallPerson = Q(tallperson, Person, R_{TallPerson})$,

$R_{TallPerson}(x) = (height(x) > 6\,(ft.))$.

The  graph  will  be: 

$V = \{person, tallperson, string, integer\}$,

$E = \{(person, string, name), (person, integer, height),$

$(person, person, father),$

$(tallperson, person, Q(tallperson, \cdot, R_{TallPerson}))\}$.

We can define a function $\zeta$ from schema instances to themselves in a similar way as in Section
4.3. The difference lies in deriving the new instance mappings of the C-classes $V_d$ that are
derived by fundamental operators other than aggregation.

Vue  IF,  C(I)V  = 

Vc  e  V  -  IV  -  Vi,  Vw.  <7s.t.(  v.  U.g)e  E.  ~3(3(£w(/)u))  2  3(tu), 

Vr  e  V,i  -  IF,  ( v,  u.expr)  e  E  $ (/),,  =  (the  induced  instance  mapping  by  expr). 

H  I)  =  /  >v  V  r  €  V  “  Vd)' 

\  f(Hv  (otherwise). 

As shown later in this section, this $\zeta$ will produce nonsensical instance mappings for a
certain class of restriction operators.

Next, we introduce a meta function symbol getinstances in the language that designates
all the instances of a C-class. For a C-class $\gamma$ and its name $n_\gamma$, $getinstances(n_\gamma)$ designates the
set of object-identities in $\delta(\iota_\gamma)$. The set $getinstances(n_\gamma)$ can be regarded as an instance of
$Set(n_{set}, \gamma)$. For example, we consider a base C-class Man and a derived C-class Richestman.


$Man = (man, \{name, wealth\}, v_{Man}, T_{Man}, \emptyset, TRUE)$,

$Richestman = (richestman, \{name, wealth\}, v_{Richestman}, T_{Richestman}, \emptyset, R_{Richestman})$,

$v_{Man}(name) = v_{Richestman}(name) = string$,
$v_{Man}(wealth) = v_{Richestman}(wealth) = integer$,

$R_{Richestman}(x) = (\forall y\; In(y, getinstances(man)) \Rightarrow wealth(x) > wealth(y))$.


The symbol getinstances causes the interpretation of predicates to depend on the instance
mappings. Hence, there may be no instance mapping that is consistent with some C-class definition. For
example, we can express an inconsistently defined C-class:

Wrong  Number  =  Q{wrongnumber ,  U(nQ0\,(  Integer),  {value)),  Ru'roniXumber), 

$v_{WrongNumber}(value) = integer$,

$R_{WrongNumber}(x) = (\forall y\; In(y, getinstances(wrongnumber)) \Rightarrow x \neq y)$.

We should note that this kind of inconsistency comes from the semantics of instances. It is different
from a plainly inconsistent restriction predicate, such as:

$R(x) = (P(x) \wedge \neg P(x))$.

The operator $\zeta$ defined above gives us the wrong answer in this case. Let us assume $I$ is
the initial schema instance.


$I_{integer} : \Omega \to Z$ (onto, one to one),

$\delta(I_{integer}) = \{\omega_1, \omega_2, \ldots\}$,

$I_{wrongnumber} = \bot$,

where $\bot$ is the null mapping:

$\bot : \Omega \to Z \quad (\delta(\bot) = \emptyset)$.


Then, by definition, we have:

$\zeta(I)_{integer} = I_{integer}$, $\zeta(I)_{wrongnumber} = I_{integer}$,
$\zeta(\zeta(I))_{integer} = I_{integer}$, $\zeta(\zeta(I))_{wrongnumber} = \bot$,
$\zeta(\zeta(\zeta(I)))_{integer} = I_{integer}$, $\zeta(\zeta(\zeta(I)))_{wrongnumber} = I_{integer}$.


In general,

$\zeta(I_n)_{wrongnumber} = I_{integer}$ (if $n$ is even), $\bot$ (if $n$ is odd),

where $I_n$ designates $\zeta^n(I)$. Thus, we cannot have the inductive limit of $\{I_n\}_{n=1}^{\infty}$. The problem
comes from the fact that $R_{WrongNumber}$ depends on its own instance mapping. More specifically,
the variable $y$ is universally quantified over the domain of the instance mapping. So $\zeta(I_n)$
"oscillates" between $I_{integer}$ and $\bot$. The induced mapping of WrongNumber doesn't provide
an object-oriented model.
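The oscillation can be reproduced with a small sketch. It assumes a finite stand-in for the integer instances (the set `INTEGERS`) and models one application of the operator to the wrongnumber component as the hypothetical function `step`; both names are illustrative.

```python
# Finite stand-in for the instances of Integer (an assumption of this sketch).
INTEGERS = frozenset(range(5))


def step(wrongnumber):
    """Keep every integer x with x != y for all current wrongnumber instances y."""
    return frozenset(
        x for x in INTEGERS if all(x != y for y in wrongnumber)
    )


def iterate(start, n):
    """Return the sequence start, step(start), ..., after n applications."""
    sequence = [start]
    for _ in range(n):
        sequence.append(step(sequence[-1]))
    return sequence


# Starting from the null mapping (no instances), the component alternates
# between all integers and no instances, so no limit is reached.
trace = iterate(frozenset(), 4)
```

Because the universal quantifier ranges over the class's own current instances, enlarging the instance set shrinks the next one, which is why the iteration never settles.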

According  to  this  observation,  we  introduce  a  class  of  predicates. 

First, we introduce the following syntactic sugar to simplify the notation.

$(\forall x : n_\gamma \;\; \phi) \stackrel{def}{=} \forall x\,(In(x, getinstances(n_\gamma)) \Rightarrow \phi)$,

$(\exists x : n_\gamma \;\; \phi) \stackrel{def}{=} \exists x\,(In(x, getinstances(n_\gamma)) \wedge \phi)$.

Then the above example is denoted by:

$R_{WrongNumber}(x) = (\forall y : wrongnumber \;\; x \neq y)$.

We call the expressions $x : n_\gamma$ explicitly typed variables, and $\forall(\exists)\, x : n_\gamma$ an explicitly typed
quantifier. For any first order formula, we can move each explicitly typed quantifier to the left
side of the expression, in the same manner as ordinary quantifiers. For example,

$\forall x : n\; (T(x) \Rightarrow \exists y : m\; P(x, y))$

becomes

$\forall x : n\; \exists y : m\; (\neg T(x) \vee P(x, y))$.

We  call  the  first  order  form  a  normally  quantified  form ,  if  each  explicitly  typed  quantifier  is 
placed  at  the  left  side  of  the  expression. 

If  a  first  order  form  has  a  normally  quantified  form  with  only  existential  explicitly  typed 
quantifiers,  we  say  that  it  is  of  type  2.  A  general  first  order  form  is  called  type  1. 
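Explicitly typed quantifiers can be read as bounded iteration over the current instances of the named C-class. The sketch below assumes a hypothetical dictionary `instances` playing the role of getinstances, and helper functions `forall_typed` and `exists_typed` that are not part of the formalism. For the richest-man example it uses a non-strict comparison as an assumption, so that comparing a man with himself does not exclude him.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for getinstances: class name -> object identities.
instances: Dict[str, List[str]] = {"man": ["m1", "m2", "m3"]}
wealth: Dict[str, int] = {"m1": 10, "m2": 30, "m3": 20}


def forall_typed(class_name: str, body: Callable[[str], bool]) -> bool:
    """(forall x : class_name  body(x)), read over the current instances."""
    return all(body(x) for x in instances[class_name])


def exists_typed(class_name: str, body: Callable[[str], bool]) -> bool:
    """(exists x : class_name  body(x)), read over the current instances."""
    return any(body(x) for x in instances[class_name])


def is_richest(x: str) -> bool:
    # Universally typed quantifier: a general (type 1) predicate.
    return forall_typed("man", lambda y: wealth[x] >= wealth[y])


richest = [x for x in instances["man"] if is_richest(x)]
someone_above_25 = exists_typed("man", lambda y: wealth[y] > 25)
```

A predicate built only from `exists_typed` can never become false when instances are added, which is the intuition behind singling out type 2.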

Theorem 4  If every restriction predicate is of type 2, then for each schema instance $I$, the
operator $\zeta$ defined in this section has a fixed point $I_\infty$ such that $I$ is a subinstance of $I_\infty$.

We can prove that the restriction operator is monotone increasing with respect to the order
among instances. So we can prove that $\zeta$ is monotone increasing. Hence, there exists an inductive
limit by the fact mentioned at the end of Section 4.2.
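The argument can be illustrated on a finite example. The sketch assumes a hypothetical derived class reachable whose restriction predicate uses only an existential typed quantifier, $R(x) = (x = 1) \vee \exists y : reachable\; edge(y, x)$; the node set, the edge set, and the function names are assumptions made for illustration.

```python
NODES = frozenset({1, 2, 3, 4, 5})
EDGES = frozenset({(1, 2), (2, 3), (4, 5)})


def step(reachable):
    """Apply R(x) = (x == 1) or (exists y : reachable with (y, x) in EDGES)."""
    return frozenset(
        x for x in NODES
        if x == 1 or any((y, x) in EDGES for y in reachable)
    )


def iterate_to_fixed_point(start=frozenset()):
    """Iterate the monotone step function from start until nothing changes."""
    current = frozenset(start)
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


limit = iterate_to_fixed_point()
```

Since the predicate is existential, a larger instance set can only admit more instances at the next step, so the sequence from the empty instance grows until it stabilizes; on a finite universe this takes finitely many steps.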

In this section, we have introduced a formal semantics for the concept model. The semantics
is expressed by a fixed point of the $\zeta$-operator. The fixed point of the $\zeta$-operator doesn't exist in some
cases. We can consider such a concept model as inconsistent. Theorem 4 shows that some class
of concept models is consistent in the sense that there exists a fixed point of $\zeta$^14.

6  Expressibility  of  Concept  Model 

In  this  chapter,  we  consider  the  expressibility  of  our  model  by  simulating  other  models. 

6.1  Relational  Model  Semantics 

The relational model can be simulated by a concept model. Since we will show that datalog
semantics can be simulated by a concept model in the next section, we can derive this result
as an easy corollary. However, we can also prove it directly. In this section we provide only a
sketch of the proof.

We express relations as compound C-classes. For example, a relation Person(name, address)
will be expressed by a C-class:

$Person = (person, \{name, address\}, v_{person}, T_{person}, \emptyset, TRUE)$,
$v_{person}(name) = v_{person}(address) = String$.

The relational operators are simulated by induced operators of the fundamental operators.

selection  $\longrightarrow$  restriction

projection  $\longrightarrow$  composition of categorization and abstraction

product  $\longrightarrow$  aggregation

14  There is a trivial case in which the fixed point always exists. If no recursive aggregation is involved in the
definition of the derived C-classes, then the concept model is consistent, i.e., the $\zeta$-operator has a fixed point. In
fact, for an initial schema instance $I$, $\zeta(I)$ is the fixed point.

Furthermore, we have a natural interpretation for the natural join operator. It is expressed by
the specialization operator. Let R and S be relations and $\gamma_R$ and $\gamma_S$ be the corresponding
C-classes. Then

$R \bowtie S \longleftrightarrow \gamma_R \wedge \gamma_S$.
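For reference, the relational operators being simulated can be sketched directly on relations encoded as lists of dictionaries. This is a plain illustration of the relational side only, under that assumed encoding; it does not implement the fundamental operators of the concept model, and the function names are hypothetical.

```python
def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep the named attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple((name, row[name]) for name in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(key))
    return result


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    result = []
    for a in left:
        for b in right:
            shared = a.keys() & b.keys()
            if all(a[name] == b[name] for name in shared):
                result.append({**a, **b})
    return result


R = [{"name": "John", "dept": "d1"}, {"name": "Mary", "dept": "d2"}]
S = [{"dept": "d1", "floor": 3}]
```

The join keeps exactly the rows that satisfy the constraints of both relations at once, which matches the reading of natural join as a specialization of the two corresponding C-classes.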


6.2  Datalog  Semantics 

In this section, we show that the semantics of datalog can be simulated by the concept model.
First we discuss how to convert datalog rules to C-class definitions. We assume that algebraic
C-classes such as Integer and String are provided from the beginning. We introduce some terminology.
A simple rule is a rule with a body consisting of one literal. If a rule is not simple, we
call it a complex rule. We call predicates such as =, < restrictive predicates and literals such
as X < 1 restrictive literals. We also assume that all rules are rectified^15. Moreover, we assume:

•  There is no predicate symbol that is used with different arities. For example, we don't
have rules such as:

p(X, Y) :- X = Y.
p(X) :- X > 0.


We  convert  rules  into  the  forms  that  will  be  easily  transformed  to  C-class  definitions  in  the 
following  way. 

1.  If the predicate symbols of facts appear as the heads of rules, we add new rules so that
they never appear as the heads of rules. For example, the rules:

p(a).
p(X) :- q(X).

will become

p1(a).
p(X) :- p1(X).
p(X) :- q(X).

2.  If there is a variable that is shared by more than one negated literal and doesn't appear
in positive literals, we rename the variable so that it is not shared by negated literals.
For example,

p(X) :- ¬q(X,Y) & ¬s(X,Y) & t(X).

will become

p(X) :- ¬q(X,Y) & ¬s(X,Z) & t(X).

3.  We convert the rules by adding equality literals so that the non-restrictive literals do not
share any variable. For example,

p(X) :- q(X,Y) & Y = 1.

will become

p(X) :- q(Z,Y) & Z = X & Y = 1.

4.  If a negated non-restrictive literal shares variables with restrictive literals, we separate
them by introducing "intermediate" equalities. For example,

p(X) :- ¬q(Y,Z) & Y = X & Z = 1.

will become

p(X) :- ¬q(Y',Z') & Y' = Y & Z' = Z & Y = X & Z = 1.

We call the expressions like Y' = Y, Z' = Z in the above example the intermediate literals
and distinguish them from restrictive literals by using a separate equality symbol, written ≐ here, instead of
=. So the second rule in the above example is expressed by

p(X) :- ¬q(Y',Z') & Y' ≐ Y & Z' ≐ Z & Y = X & Z = 1.
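The first of these rewriting steps can be sketched as follows. The encoding is an assumption of this sketch: a literal is a pair (predicate, arguments), a rule is a pair (head, body), and the new predicate name is formed by appending "1" as in the example above, without checking for name clashes. The function name `separate_facts` is hypothetical.

```python
def separate_facts(facts, rules):
    """Rename fact predicates that also occur as rule heads and add bridge rules.

    facts: list of (predicate, args); rules: list of ((predicate, args), body),
    where body is a list of (predicate, args) literals.
    """
    head_predicates = {head[0] for head, _body in rules}
    renamed = {}
    new_facts = []
    bridge_rules = []
    for predicate, args in facts:
        if predicate in head_predicates:
            if predicate not in renamed:
                # Assumed naming scheme: p becomes p1, as in the example.
                renamed[predicate] = predicate + "1"
                variables = tuple(f"X{i + 1}" for i in range(len(args)))
                bridge_rules.append(
                    ((predicate, variables), [(renamed[predicate], variables)])
                )
            new_facts.append((renamed[predicate], args))
        else:
            new_facts.append((predicate, args))
    return new_facts, bridge_rules + list(rules)


facts = [("p", ("a",))]
rules = [(("p", ("X",)), [("q", ("X",))])]
new_facts, new_rules = separate_facts(facts, rules)
```

After this step every predicate is defined either only by facts or only by rules, which is what later allows fact predicates to become base C-classes and rule heads to become derived C-classes.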


For the rules after the above conversion, we assign C-classes as follows.

1.  For each non-restrictive literal symbol, we assign a C-class (taking the predicate symbol
as its name).

2.  For each argument of a non-restrictive literal, we assign numbered literal names as the
attribute names. For example, a literal p(X,Y,Z) has attributes p1, p2, p3, where p1 corresponds
to X, p2 to Y and p3 to Z. Let us denote the correspondence by $\alpha$. In the above example,

$\alpha(X) = p1$, $\alpha(Y) = p2$, $\alpha(Z) = p3$.

3.  For each variable, we assign a C-class name as follows. We express the assignment by a
mapping $\tau$.

•  If a variable appears in a restrictive literal, we assign the name of an algebraic C-class
according to the literal. For example, if we have X = 1, we get

$\tau(X) = integer$^16.

•  Otherwise, we assign the most generic C-class name top:

$\tau(X) = top$.

We should remember that we assumed the existence of the most generic C-class top
$\gamma_\top$ in the C-class hierarchy.

4.  For each attribute, we assign a C-class name in the following way. We determine the
values of the attribute value function $v_p$ for each literal symbol $p$. In the above example,

$v_p(p1) = \tau(X)$, $v_p(p2) = \tau(Y)$, $v_p(p3) = \tau(Z)$.

5.  We convert the bodies of rules to first order forms with explicitly typed quantifiers. We
describe the conversion with examples. We express the conversion with a mapping
$\sigma$.

16  If X is paired with different types (C-classes) by equalities, we assign the least upper bound of those C-classes to
the variable X. For example, if X = F and X = [illegible], we assign [illegible] to X.

•  restrictive literal

We convert the variables as shown in the following example.

$\sigma(X = 1) = \begin{cases} \alpha(X)(self) = 1 & \text{(if X is in the head of the rule)} \\ \alpha(X)(x_p) = 1 & \text{(if X is in a non-restrictive literal } p(\ldots)\text{)} \\ x = 1 & \text{(otherwise)} \end{cases}$

where self will be the free variable in the restriction predicate of the restriction
operator. The variable $x_p$ designates the instance of C-class $p$.

•  intermediate literal

Let X' ≐ X be an intermediate literal, where X' is in a negated non-restrictive^17 literal
and X is in a restrictive literal. The variable X will be converted in the same way
as in the restrictive literal. We denote the result by $\sigma(X)$. The variable X' is converted to
$\alpha(X')(x_p)$, where $p$ is the literal symbol that contains X'. So X' ≐ X will be converted
to $\alpha(X')(x_p) = \sigma(X)$.

•  non-negated non-restrictive literal

We assume that non-restrictive literals are placed on the left side of restrictive literals
in the bodies of rules.

$\sigma(p(X, Y)) = \exists x_p : p$.

•  negated non-restrictive literal

$\sigma(\neg p(X, Y)) = \forall x_p : p$.

After the above conversion, we add explicitly typed quantifiers for the variables that
appear only in the restrictive literals.

•  If the variable X appears only in a negated literal, we add $\forall x : \tau(X)$.

•  Otherwise, we add $\exists x : \tau(X)$.

We arrange the existential quantifiers on the left side of the universal quantifiers. Next we collect
the intermediate literals for each negated non-restrictive literal and take the disjunction
of the negations of the literals. For example,

p(X,Y) :- ¬q(W,V) & s(A,B) & X = W & V = B & A = Y & B = C.

will become (with the intermediate literals W' ≐ W and V' ≐ V)

p(X,Y) :- ¬q(W',V') & s(A,B) & W' ≐ W & V' ≐ V & X = W & V = B & A = Y & B = C.

Then its body will be transformed to:

$\exists c : top\; \exists w : top\; \exists v : top\; \exists x_s : s\; \forall x_q : q\; ((\neg(q1(x_q) = w) \vee \neg(q2(x_q) = v)) \wedge$

$(p1(self) = w \wedge s2(x_s) = v \wedge s1(x_s) = p2(self) \wedge s2(x_s) = c))$.

Finally, we convert rules to C-class definitions.

17  More precisely, we should say non-restrictive and non-intermediate literal. However, we use the term "non-
restrictive literal" in this sense.

•  For rules with head literal $p(X_1, X_2, \ldots, X_n)$ and bodies $B_i$ ($1 \le i \le m$),

p(X1, X2, ..., Xn) :- B1
p(X1, X2, ..., Xn) :- B2
...
p(X1, X2, ..., Xn) :- Bm

the C-class $\gamma_p$ is defined as:

$\gamma_p = Q(p, \Pi(record, (p_1, \ldots, p_n), (\tau(X_1), \ldots, \tau(X_n))), \bigvee_{i=1}^{m} \sigma(B_i))$.

•  For facts, we assign a C-class to each predicate symbol of facts. For example, for the
following facts,

f(1, "abc").
f("a", "bc").

we have

$\gamma_f = \Pi(f, (f1, f2), (top, string))$.

We regard all the C-classes as abstract C-classes. We construct a concept model
in which:

•  Algebraic C-classes, such as Integer, String, are given.

•  Base C-classes are those obtained from facts.

•  The rest of the C-classes are regarded as derived C-classes.

If we provide the instance mappings for all the base C-classes according to ground facts, we
can get the datalog semantics as the least fixed point of the $\zeta$-operator. If there is no negated
subgoal, $\zeta$ is monotone increasing, because the restriction predicates are of type 2. Thus $\zeta$ has
the inductive limit as its fixed point. If we have stratification, we can get the least fixed point
of $\zeta$ by the algorithm described in Chapter 3 of [UL 88].
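For rules without negation, the least fixed point referred to here coincides with the result of the standard naive bottom-up evaluation of datalog. The sketch below shows that standard technique, not the $\zeta$-operator itself. It assumes literals encoded as (predicate, arguments) pairs, variables written as strings starting with an upper-case letter, and constants as lower-case strings; restrictive literals and negation are not handled.

```python
def unify(term, value, substitution):
    """Bind a variable (upper-case initial) or compare a constant with a value."""
    if term[:1].isupper():
        return substitution.setdefault(term, value) == value
    return term == value


def match_body(body, database, substitution):
    """Yield every substitution that satisfies all body literals in the database."""
    if not body:
        yield substitution
        return
    (predicate, args), rest = body[0], body[1:]
    for fact_predicate, fact_args in database:
        if fact_predicate != predicate or len(fact_args) != len(args):
            continue
        extended = dict(substitution)
        if all(unify(t, v, extended) for t, v in zip(args, fact_args)):
            yield from match_body(rest, database, extended)


def naive_eval(facts, rules):
    """Apply all rules repeatedly until no new fact is derived."""
    database = set(facts)
    while True:
        derived = set()
        for (head_predicate, head_args), body in rules:
            for substitution in match_body(body, database, {}):
                derived.add(
                    (head_predicate,
                     tuple(substitution.get(t, t) for t in head_args))
                )
        if derived <= database:
            return database
        database |= derived


facts = {("parent", ("a", "b")), ("parent", ("b", "c"))}
rules = [
    (("anc", ("X", "Y")), [("parent", ("X", "Y"))]),
    (("anc", ("X", "Z")), [("parent", ("X", "Y")), ("anc", ("Y", "Z"))]),
]
result = naive_eval(facts, rules)
```

Each pass only adds facts, so on a finite set of constants the loop terminates; the facts play the role of the base C-class instances and the derived `anc` tuples that of the derived C-class instances.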

6.3  IQL  Semantics 

We show that our model can express the semantics of Abiteboul and Kanellakis' IQL model.

In the following discussion, the meaning of the notation is the same as theirs, unless it is
explicitly mentioned otherwise. We have the sets of relation names R, class names P, attributes A,
constants D, and object identities O. A given schema (R, P, T) is converted by introducing
new class names P' so that each type expression appearing in T has depth 1, i.e., for each
class name $p$ in $P \cup P'$,

$T(p) = D \mid p' \mid [A_1:p_1, \ldots, A_n:p_n] \mid \{p'\} \mid (p_1 \vee p_2) \mid (p_1 \wedge p_2)$,

where $p', p_1, p_2, \ldots, p_n$ are in $P \cup P'$.

For example, if we have the type assignment

T(person) = [name:[first:string, last:string], age:integer],

we convert it as:

T(person) = [name:person_name, age:integer],
T(person_name) = [first:string, last:string].
Another example is that:

T(set_of_rational) = {[den:integer, num:integer]}

will be converted to:

T(set_of_rational) = {rational},

T(rational) = [den:integer, num:integer].
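The flattening of type expressions to depth 1 can be sketched as a small recursive procedure. The encoding is an assumption of this sketch: a class name is a string, a tuple type is a dictionary from attribute names to types, and a set type is the pair ("set", element type). New class names are generated mechanically from the owner and attribute names, so they differ from hand-chosen names such as rational in the example above.

```python
def flatten(schema):
    """Return a schema in which every type expression has depth 1."""
    flat = {}

    def name_of(type_expr, hint):
        # A class name is already depth 0; anything else gets a new class.
        if isinstance(type_expr, str):
            return type_expr
        flat[hint] = shallow(type_expr, hint)
        return hint

    def shallow(type_expr, owner):
        if isinstance(type_expr, dict):
            return {
                attr: name_of(t, f"{owner}_{attr}")
                for attr, t in type_expr.items()
            }
        if isinstance(type_expr, tuple) and type_expr[0] == "set":
            return ("set", name_of(type_expr[1], f"{owner}_elem"))
        return type_expr

    for class_name, type_expr in schema.items():
        flat[class_name] = shallow(type_expr, class_name)
    return flat


schema = {
    "person": {"name": {"first": "string", "last": "string"}, "age": "integer"},
    "set_of_rational": ("set", {"den": "integer", "num": "integer"}),
}
flat_schema = flatten(schema)
```

The generated classes correspond to the new class names P' introduced by the conversion; only the naming scheme (`person_name`, `set_of_rational_elem`) is specific to this sketch.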

Next we change the syntax of literals in Abiteboul and Kanellakis' paper. We convert each
literal expression $t_1(t_2)$ to $In(t_2, t_1)$, where $t_1$ is of type $\{t_2\}$. Furthermore, for a type
assignment $\tau$ for variables in rules, we introduce new C-class names so that the value of $\tau$ is always
a class name. For example, if we have a rule:

p([A1:X, A2:Y]) ← q(X), r(Y).

and the type assignment for variables:

τ(X) = [den:integer, num:integer], τ(Y) = integer,

we convert the type assignment by introducing a class name rational and a new type
assignment:

τ(X) = rational, τ(Y) = integer,

T(rational) = [den:integer, num:integer].

Furthermore, for each type expression that appears in a rule, we assign a new class name,
which will also be included in P'. We introduce a new class p1:

T(p1) = [A1:τ(X), A2:τ(Y)].

Finally,  we  convert  the  rule  using  new  type  assignment  and  class  names,  together  with  newly 
introduced  variables.  For  example,  the  above  rule  will  be: 

p(Z)  —  q(X),r(Y),At(Z)  =  X.A2(Z)  =  Y. 

r(Z)  =  pl,T(pl)  =  [A1:r(X).A2:r(Y)j. 

YVe  extended  the  syntax  by  interpreting  Aj  and  A2  as  a  function  symbol. 

After this conversion, we have:

• class names $P \cup P'$.

• the extended type assignments $T'$ for classes and $\tau'$ for variables. (Note that we can assume that each rule has a disjoint set of variables.)

• new rules with only variables as the arguments of relation symbols $R \cup \{\text{In}\}$.

Now, we create the C-classes according to the extended schema and the modified rules in the following way. First, we convert the schema into C-classes.
1. $T(p) = p'$

We replace each class name $p$ by $p'$ in the schema expression and rules.

2. $T(p) = \{p'\}$

We use set construction.

$$\gamma_p = \text{Set}(p, \gamma_{p'}).$$

3. $T(p) = p_1 \vee p_2$

$$\gamma_p = \gamma_{p_1} \vee \gamma_{p_2}.$$

4. $T(p) = p_1 \wedge p_2$

$$\gamma_p = \gamma_{p_1} \wedge \gamma_{p_2}.$$

5. $T(p) = [A_1:p_1, \ldots, A_m:p_m]$

We use recursive aggregation to define the $\gamma_p$'s.

$$\gamma_{\text{dummy}} = \Pi(\text{dummy}, G, \Phi), \quad G = (\Gamma, E),$$
$$\Gamma = \{p \in P \cup P' \mid p \text{ appears in the aggregation expression}\},$$
$$E = \{(p, p', A_k) \mid T(p) = [A_1:p_1, \ldots, A_k:p', \ldots]\}.$$

$\Phi$ is any set of symbols that has a one-to-one correspondence with $\Gamma$.

Next we convert the rules into C-classes in the same way as we convert datalog rules. The only difference is that we may have a functional expression, such as $A_1(Z)$, as an argument of equality. We can convert such an expression naturally to a first order formula. In the above example, the rule:

$$p(Z) \leftarrow q(X), r(Y), A_1(Z) = X, A_2(Z) = Y.$$

would be converted into

$$\gamma_p = \sigma(p, p_1, (\exists x_q{:}q \; \exists y_r{:}r \;\; A_1(\text{self}) = q_1(x_q) \wedge A_2(\text{self}) = r_1(y_r))).$$

For a given IQL program $\Gamma(S, S_{in}, S_{out})$, we convert the schema $S$ and the rules in the above way and get C-classes. Then we define a concept model with:

• The C-class $\gamma_D$ for the constants $D$ is the only algebraic C-class.

• The C-classes that correspond to the initial ground facts are base C-classes, as in the case of a datalog program.

• The remaining C-classes are derived C-classes.

Then the program's inflationary fixed point will be provided by a fixed point of the (-operator. Note that providing an instance of a schema in the IQL model is the same as providing a set of ground facts.
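To illustrate what an inflationary fixed point is, the following Python sketch iterates a rule-application step and accumulates the derived facts until nothing new appears. It is a generic illustration and not the operator of our formalism: the names `inflationary_fixed_point` and `example_step`, and the encoding of facts as (relation name, argument tuple) pairs, are assumptions made for this example. The loop terminates only when the set of derivable facts is finite.

```python
from typing import Callable, FrozenSet, Set, Tuple

Fact = Tuple[str, tuple]  # (relation name, argument tuple); assumed encoding


def inflationary_fixed_point(
    facts: Set[Fact],
    step: Callable[[FrozenSet[Fact]], Set[Fact]],
) -> FrozenSet[Fact]:
    """Repeat facts := facts | step(facts) until no new fact is derived."""
    current = frozenset(facts)
    while True:
        extended = current | frozenset(step(current))
        if extended == current:
            return current
        current = extended


def example_step(facts: FrozenSet[Fact]) -> Set[Fact]:
    """One application of the rule p([A1:X, A2:Y]) <- q(X), r(Y)."""
    qs = [args[0] for rel, args in facts if rel == "q"]
    rs = [args[0] for rel, args in facts if rel == "r"]
    return {("p", (("A1", x), ("A2", y))) for x in qs for y in rs}
```

Starting from the ground facts `q(1)` and `r(2)`, the iteration adds the single fact for `p` and then stops, since a second application of the rule derives nothing new. Facts are never removed, which is what makes the iteration inflationary.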


6.4  IRIS  Semantics 

In this section, we briefly show that most of the semantics of the IRIS system [FS 89] can be expressed by a concept model. We provide only a sketch of simulating the IRIS semantics by the concept model.

Up to now, we assumed that the algebras that appear in the value-oriented model of C-classes are partial-valued algebras. In order to capture the semantics of the IRIS system, we assume that they are multi-valued algebras. We need no change in our theory: the partial functions in our discussion can be replaced by multi-valued functions, because multi-valued functions and sets form a category, as we suggested in Section 2.1.

We formalize the semantics of the IRIS system without foreign functions. First, we assign algebraic C-classes to its literals, such as integers and strings. Second, we assign base C-classes to its objects. Finally, we describe the functions by first order sentences and add them to the auxiliary sentences of C-classes. Then the object-oriented model of these C-classes provides the semantics of the IRIS system. Actually, the semantics is expressed exactly by the object-identity space of the object-oriented model.

7  Future  Work 

There  are  several  issues  for  future  work. 

•  Schema  Evolution 

As suggested in Chapter 5, object-identity plays an essential role in schema maintenance. It may provide a formal guideline for schema evolution. For example, when a new concept (schema object) is added to the schema, the existing concepts should be altered so that base concepts remain abstract concepts.

• Complex Values

We demonstrated that complex values have an inherent disadvantage concerning the maintenance of consistency of a knowledgebase, because they cannot be combined with object-sharing. However, they have a strong advantage in providing structured data that a programmer can easily handle, as discussed in [LR 89]. Hence we should introduce a formalism that can provide structured data without sacrificing object-sharing. The author presumes that this would be attained by introducing "local concepts." Namely, the language provides a construct for defining concepts that are local to a concept. A programmer can provide the access method to the local concepts so that the instances of a local concept and their attributes can be shared from outside. We should note that this will bring no change in the semantics of object-identity. Any object-identity is inherently global, because knowledge is global. The object-identity of a local concept is realized in the "global" object-identity space, as is that of a global concept. The construct of local concepts will be introduced for programming convenience.

• Implementation of Concept Model

Recently, a prototype system of the Concept Model has been built that implements the model as a language. The prototype system is written in 12,000 lines of Common Lisp code. The system checks the integrity constraints automatically. An actual session performed on the prototype system is shown in Appendix C.

There are several technical issues, such as type checking, consistency maintenance and object-binding, which will be discussed in the next report.


8  Conclusion 

We have presented a formalism that expresses a clear semantics of object-identity and the essential distinction between the value-oriented model and the object-oriented model. In order to express the value-oriented semantics, we have introduced the notion of data algebras. The semantics of the object-oriented model is expressed by the combination of the object-identity representation and the value-oriented representation.

Moreover, the formalism has incorporated the logical database model into the object-oriented model by expressing logical relations as classes.

We should emphasize that our model provides the full advantage of object-sharing using object-identities when it is applied to a practical system. Yet it also provides a structured algebraic semantics.

The concept model based on the formalism has been proposed, which provides formal guidelines on knowledgebase design. The concept model is an attempt to represent the existing objects in the real world as faithfully as possible. Namely, the instances of base C-classes correspond strictly to the existing objects. Then the abstraction of those objects is expressed by derived C-classes. The model provides a way of expressing and maintaining the integrity constraints easily.
A  Database  Operation

So far, we have discussed the schema representation of a database. In this chapter, we will describe the database operations, query and update.

A.1  Query

The semantics of a query is simple for the concept model M.

M  = 

A query is basically to get the instance mapping of a concept $\gamma$ in $A \cup B \cup D$. We take the minimal closed set $\Gamma$ of concepts that contains $\gamma$ in the union of $A$, $B$ and $D$. Then we obtain a fixed point of the (-operator for $\Gamma$. As discussed in the previous chapter, for a certain concept, there may not exist a fixed point.

A.2  Update

An update modifies the object-model of concepts, i.e., it modifies the instance mappings. We assume that the value-oriented model and the object-identity space are fixed. Further, we assume that any update is obtained by composing the following three operations.

A.2.1  Insertion

Basically, insertion can be done to base concepts. When we insert an instance into a derived concept, the operation should be transformed into an insertion into a base concept. Thus we cannot insert into a derived concept obtained by the constructive aggregation. On the other hand, we can insert an instance into a concept derived by the restriction operator. If we allow "null-valued" attributes, we can insert an instance into a concept derived by the abstraction operator.

The procedure for insertion is as follows.

1. Create a new object-identity, say $\omega$.

2. Register the values of the attributes, say $\Phi$, of $\omega$. More specifically, modify the interpretation of each $f$ in $\Phi$. If the value (object-identity) doesn't exist, we create and insert it recursively.

3. Check the integrity constraints. If the constraints are not satisfied, then undo the operation. (Signal error.)
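The three steps above can be sketched as follows. This is a toy illustration under our own assumptions, not the ADL implementation: the class `KnowledgeBase`, its fields, and the convention that a nested dictionary stands for a value that does not yet exist are all hypothetical. Undo is realized here by restoring a snapshot taken before the insertion.

```python
import copy
import itertools
from typing import Any, Callable, Dict, List, Set


class KnowledgeBase:
    """Toy store: attribute name -> {object-identity -> value}."""

    def __init__(self, constraints: List[Callable[["KnowledgeBase", int], bool]]):
        self.attributes: Dict[str, Dict[int, Any]] = {}
        self.instances: Set[int] = set()
        self.constraints = constraints
        self._ids = itertools.count(1)  # identities are never reused

    def insert(self, values: Dict[str, Any]) -> int:
        """Insert an instance; restore the previous state if a check fails."""
        snapshot = (copy.deepcopy(self.attributes), set(self.instances))
        try:
            return self._insert(values)
        except ValueError:
            self.attributes, self.instances = snapshot  # undo the operation
            raise  # signal the error to the caller

    def _insert(self, values: Dict[str, Any]) -> int:
        # Step 1: create a new object-identity.
        oid = next(self._ids)
        self.instances.add(oid)
        # Step 2: register the attribute values; a nested dict is treated as
        # a value that does not exist yet and is inserted recursively.
        for name, value in values.items():
            if isinstance(value, dict):
                value = self._insert(value)
            self.attributes.setdefault(name, {})[oid] = value
        # Step 3: check the integrity constraints.
        if not all(check(self, oid) for check in self.constraints):
            raise ValueError("integrity constraint violated")
        return oid
```

A constraint is modeled as a predicate on the knowledge base and an object-identity, so a restriction such as "the age is greater than 0 and less than 200" is written as an ordinary function that returns true when the attribute is undefined.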

A.2.2  Deletion

Theoretically, we don't allow the deletion of an object-identity, because an object-identity is something that expresses a really existing object. For example, even if a person dies, the fact of the existence of the person cannot be eliminated from our knowledge. However, in a practical system, we may eliminate an object-identity if it is no longer referred to by the objects of our interest. This operation is performed by a kind of garbage collection.
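One way to read "a kind of garbage collection" is as a reachability check over the references between object-identities. The sketch below is an assumption of ours and not a description of the prototype: the function name `collect_unreferenced`, the reference map, and the notion of a set of root objects "of our interest" are hypothetical.

```python
from typing import Dict, Iterable, Set


def collect_unreferenced(
    references: Dict[int, Set[int]],
    roots: Iterable[int],
) -> Set[int]:
    """Return the object-identities that are not reachable from the roots.

    references maps each object-identity to the identities its attributes
    refer to; roots are the objects of our interest.
    """
    reachable: Set[int] = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in reachable:
            continue
        reachable.add(oid)
        stack.extend(references.get(oid, ()))
    return set(references) - reachable
```

Only the identities returned by this function would be candidates for elimination; every identity that is still referred to, directly or indirectly, by an object of interest is kept.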


A.2.3  Modification

When we modify an attribute value of an instance $\omega$, we change the interpretation of the function symbol, say $f$, that corresponds to the attribute. More specifically, we change the value that the interpretation of $f$ assigns to $\omega$. The modification should be compatible with the value-oriented model. If the object-identity for the new value is not in the knowledge base, we create a new object-identity by the same procedure as for insertion.

A.2.4  View Update

Since we have a homogeneous representation of concepts, we can update the knowledgebase through derived concepts, whenever it is possible. More precisely, if we can specify a unique object-identity (instance) to be deleted or modified, then we can delete the instance or modify the attribute value of the instance. When we insert an object through a view concept, if we can verify that the object doesn't exist as an instance of a base concept, we can convert the insertion operation to the insertion of the object-identity into a base concept.

To summarize, if the update can be mapped to a unique update at the base concept level, then it can be performed. There is a typical case in which an update through a derived concept can be done safely. If a derived concept is derived from $A$ and $B$ only through abstractions and restrictions, then the deletion and modification can be mapped to a unique update of the base concept, because the induced instance mapping of the concept derived by abstraction and restriction has a smaller domain than that of the instance mapping of the base concept.

B  Methods,  Overloading,  Encapsulation 

Methods and encapsulation can be formalized simply by using functions with subtype matching. We should note that we don't distinguish type and class in our model. A C-class plays the role of a type. In other words, each type will be assigned to only one class. Since we have the C-class hierarchy, there is no semantic reduction even without the distinction of class and type. In this chapter, we use the term type instead of C-class when we use a C-class as a type.

B.1  Method by Function

All methods are defined as functions with strong type checking. A method of a C-class $\gamma$ is defined by a binary function. One argument type for the function is $\gamma$; the other is the type for the message. Note that we allow multiple function definitions in the following sense. For each function name, we can have multiple definitions, so long as the tuples of the argument types of the definitions are different. The tuples of the argument types are ordered by the product order derived from the C-class hierarchy. Hence, the compiler will try to pick up the most specific function definition according to the argument types. For example, if we have the expression

$$(f \; c_1 \; \ldots \; c_n)$$

(we use a Lisp-like notation for function application), we pick up the function definition of $f$ with the minimal type tuple that matches the types of $(c_1, \ldots, c_n)$. We require the minimal type tuple to be unique. In a practical system, if there exists more than one minimal type tuple, then the compiler will signal an error.
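The selection of the most specific definition can be sketched as follows. This is a minimal illustration under stated assumptions and not the ADL compiler: the hierarchy `PARENT`, the helper names, and the representation of definitions as a dictionary keyed by type tuples are all hypothetical, and a single-inheritance hierarchy is assumed for brevity.

```python
from typing import Dict, Optional, Tuple

# Hypothetical hierarchy: each type maps to its direct supertype (None = root).
PARENT: Dict[str, Optional[str]] = {
    "top": None,
    "person": "top",
    "student": "person",
    "message": "top",
}


def is_subtype(t: Optional[str], s: str) -> bool:
    """True if t equals s or s is an ancestor of t in the hierarchy."""
    while t is not None:
        if t == s:
            return True
        t = PARENT[t]
    return False


def tuple_leq(a: Tuple[str, ...], b: Tuple[str, ...]) -> bool:
    """Product order: a <= b if every component of a is a subtype of b's."""
    return len(a) == len(b) and all(is_subtype(x, y) for x, y in zip(a, b))


def select_definition(definitions: Dict[Tuple[str, ...], str],
                      arg_types: Tuple[str, ...]) -> str:
    """Pick the definition with the unique minimal matching type tuple."""
    applicable = [sig for sig in definitions if tuple_leq(arg_types, sig)]
    minimal = [s for s in applicable
               if not any(t != s and tuple_leq(t, s) for t in applicable)]
    if len(minimal) != 1:
        raise TypeError(f"no unique most specific definition for {arg_types}")
    return definitions[minimal[0]]
```

With definitions for `(person, message)` and `(student, message)`, a call with argument types `(student, message)` selects the second one. With definitions for `(student, top)` and `(person, message)` the same call has two incomparable minimal tuples, and an error is signaled, as required above.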


B.2  Overloading 

The  overloading  of  methods  is  naturally  attained,  because  the  most  specific  function  definition 
is  taken  for  a  particular  pair  of  type  and  message. 

B.3  Encapsulation  by  Subtype  Matching 

The  encapsulation  is  realized  by  the  C’-class  hierarchy.  Let  us  assume  that  ('-class  t  is  a  super 
class  of  ‘  2 ■ 

*1  A  "2*  7.  =  (».-  ^1,  f'l.  Tt,  A,,  Rt)  (  I  =  1,  2  ) 

The  attributes  t  hat  are  proper  for  7  j  cannot  be  accessed  from  In  other  words,  i  he  ai  gumenl 
to  the  function  in  <h|  -  4>_>  should  be  an  instance  of  a  subclass  of  yj.  Note  that  we  include  a 
C-cIass  itself  to  its  subclass. 

By type casting, we can easily provide a way to define a method of $\gamma_2$ that can access the attributes proper to $\gamma_1$. For example, let (* · ·) be the type casting function. If a variable x has type $\gamma_2$, and $\gamma_1$ is a subtype of $\gamma_2$, then (* $\gamma_1$ x) has type $\gamma_1$. Then we can define a function as in the following example.

(defunction fun1 (x:$\gamma_2$, m:$\gamma_m$)
  (f (* $\gamma_1$ x)) ...),

where f is the function with argument type $\gamma_1$.


B.4  Application  to  Database  Security 

The encapsulation can be used for database security. In this section, we describe a rough sketch of the idea. First, a user is provided with a set of C-classes that he/she can access. More specifically, the type names that the user can use for type declarations are restricted in the access language. So, we could say that each user has a different access language. Let us denote the set of accessible types for a user $u$ by $\mathcal{A}(u)$. We call it the access domain. The restriction of accessible C-classes is used as follows, for example. When we want to restrict a user to access only the instances of a C-class that satisfy a certain condition, this can be easily realized by allowing the user to access only the C-class derived from that C-class by a restriction operator.

A user who can access only some higher level of types is not able to access the attributes proper to their subtypes without a type casting function. Hence, we can impose a protection by restricting the use of the type casting function. The protection mechanism is quite simple. A user is provided with a set of types that can be used as the destination type of the type casting function. In the above example, each user has a restriction on the first argument of (* · ·). Let $\mathcal{P}(u)$ be the set of types that a user $u$ is allowed to use in the type casting function. The set $\mathcal{P}(u)$ is called the access range of a user $u$. Let us call $u$ a supervising user of type $\gamma$ if $\mathcal{P}(u)$ contains $\gamma$. If a user $u$ needs a method that should access the attributes of a type that is neither in the access range nor in the access domain, $u$ should ask a supervising user of the type to define the function. Then the defined function is shipped to $u$. Each user $u$ has the set of given functions $\mathcal{F}(u)$ that he/she can use other than functions of his/her own definition. The shipped function is added to $\mathcal{F}(u)$. Therefore, the protection is completely characterized by the triplet $(\mathcal{A}(u), \mathcal{P}(u), \mathcal{F}(u))$ of access domain, access range and given functions. We call it the access privilege. Furthermore, we could introduce a relevant order to designate the strength of an access privilege. Let us denote the set of all access privileges by $P$. Let $\alpha$, $\beta$ be in $P$,

$$\alpha = (\mathcal{A}_\alpha, \mathcal{P}_\alpha, \mathcal{F}_\alpha), \quad \beta = (\mathcal{A}_\beta, \mathcal{P}_\beta, \mathcal{F}_\beta).$$

The access privilege $\alpha$ is stronger than $\beta$ if

$$\mathcal{A}_\alpha \supseteq \mathcal{A}_\beta \text{ and } \mathcal{P}_\alpha \supseteq \mathcal{P}_\beta \text{ and } \mathcal{F}_\alpha \supseteq \mathcal{F}_\beta.$$

Moreover, we can extend the notion of access privilege by assigning a protection to each of the database operations, such as read and write, insert and delete. Let $C$ be the categories of operations. The extended access privilege $\Pi$ is the collection of mappings from $C$ to $P$.

We can manage the access of users by $\Pi$ together with the access hierarchy provided by the partial order of access privileges.

For example, it is natural to require that write protection is tighter than read protection. Then it is expressed by:

$$\forall f \in \Pi, \quad f(\text{'write'}) \le f(\text{'read'}).$$

We can also introduce an order in $\Pi$. For $f$, $g$ in $\Pi$, $g$ has stronger access power than $f$ if

$$\forall c \in C, \quad f(c) \le g(c).$$

Then users can be organized by $\Pi$ with this order. For example, a manager would have stronger access power than his staff members with this order.
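The orders on access privileges and on extended access privileges can be sketched with plain set inclusion. The names `AccessPrivilege`, `is_at_least` and `has_at_least_power` are hypothetical and introduced only for this illustration; the sketch also assumes that both extended privileges are defined on the same categories of operations.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class AccessPrivilege:
    """The triplet (access domain, access range, given functions) of a user."""
    domain: FrozenSet[str]      # types usable in type declarations
    cast_range: FrozenSet[str]  # types usable as destination of type casting
    given: FrozenSet[str]       # names of functions shipped to the user

    def is_at_least(self, other: "AccessPrivilege") -> bool:
        """True if this privilege is componentwise a superset of other."""
        return (self.domain >= other.domain
                and self.cast_range >= other.cast_range
                and self.given >= other.given)


def has_at_least_power(g: Dict[str, AccessPrivilege],
                       f: Dict[str, AccessPrivilege]) -> bool:
    """Extended privileges: g dominates f on every category of operation."""
    return all(g[c].is_at_least(f[c]) for c in f)
```

A staff member who may read `person` but write nothing, and a manager who may read more types and also write `person`, are then ordered as expected: the manager's extended privilege dominates the staff member's, but not conversely. The requirement that write protection is tighter than read protection becomes the check `f["read"].is_at_least(f["write"])`.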

C  ADL  Sample  Session 

As we mentioned earlier, the implementation of the formalism in this report is in progress. It is realized as a data description language called ADL (Algebraic Data Language). Currently, the system is made of 12,000 lines of Common Lisp code. It has the following features.

1. CLOS-like Functional Language

It has a CLOS-like functional language with strong type checking for hierarchical types, i.e., it allows subtypes. We can attach a restriction predicate to each class to express the integrity constraints.

2. Lazy Evaluation of Object-binding

The binding of instances to each class will be delayed until necessary. Moreover, the update of instances is performed according to the local logs of classes. The dependency of classes, such as "what update of which class will affect which class," is checked at compile time. Since the object-binding is done according to the local update logs, the update cost is smaller and we can perform a necessary optimization according to the sequences of updates recorded in the logs.

3. Incremental Class and Function Definition

New schema objects (C-classes) and functions on C-classes can be added after instances are bound to classes. If the new classes contradict the instances of base C-classes, all the further transactions may be rejected as inconsistent. The contradicting instances of derived C-classes will be automatically fixed when the object-binding for the classes is performed.

The current version of the language is quite tentative and will be subject to many changes in the future.

There are built-in classes and functions. For classes, we have 'top', 'bool', 'number', 'string', 'sequence', 'bag', 'set', etc. For functions, we have:

plus       : number × number → number   ; (add numbers)
minus      : number × number → number   ; (subtract a number from a number)
length     : string          → number   ; (string length)
substring? : string × string → bool     ; (1st arg. is a substring of 2nd arg.?)
..., etc.

The following is an actual session performed on this system. The lines preceded by ";;" are comments, which were added afterwards. The highlights are in the second half of the session, where the automatic integrity constraint checking, incremental class definition and object-binding are demonstrated.


ADL[0]> (lisp (reset-kb!))

;; Clear all instances and initialize the transaction management
;; routine.

rest-kb


ADL[0]> (defconcept person (base entity) (isa top)
          ((name string) (address location) (age number) (phone string)
           (occupation string) (salary number))
          (res (and (gt (age self) 0) (lt (age self) 200))))

ADL[0]>  (defconcept  location  (base  entity)  (isa  top) 

((state  string)  (city  string)  (street  string)  (number  string) 
(apartment  string)  (apartment-number  string)) 

(res  true)) 

ADL[0]>  (defconcept  student  (derived  entity)  (isa  person)  () 

(res  (equal  (occupation  self)  "student"))) 

ADL[0]>  (defconcept  professor  (derived  entity)  (isa  person)  () 

(res  (equal  (occupation  self)  "professor"))) 


;; We have defined four new C-classes: person, location, student,
;; and professor.

ADL[0]>  (compile) 

;;  recompile  the  classes  and  functions. 


;; First, we demonstrate a nested transaction and object sharing.


ADL[0]> (begin-transaction) [1]

;; begin the transaction.

;; The system supports nested transactions.

;; The number in the prompt "ADL[#]>" shows the nesting depth.
ADL[1]> (insert (person (name "John")
                        (address (location (state "CA")
                                           (city "Palo Alto")
                                           (street "Yale")
                                           (number "2260")))
                        (age 20)
                        (salary 40000)))

ADL[1]> (set John (find person (equal (name self) "John")))

;;  Any  instance  can  be  bound  to  a  global  variable. 

;; Note that we don't have to specify all of the attribute
;; values, because an attribute is treated as a partial function.

ADL[1]> (insert (person (name "Mary")
                        (address (location (state "NY")
                                           (city "New York")
                                           (street "West")
                                           (number "47")))
                        (age 18)
                        (salary 50000)))

ADL[1]> (set mary (find person (equal (age self) 18)))

ADL[1]> (end-transaction)
transaction[1] successfully terminated

ADL[0]> (begin-transaction) [2]

ADL[1]> (modify mary age 25)

ADL[1]> (begin-transaction) [3]

ADL[2]> (begin-transaction) [4]

ADL[3]> (modify mary age 21)

;; We modified Mary's age in the deepest level of the
;; transactions.

ADL[3]> (end-transaction)
transaction[4] successfully terminated

ADL[2]> (output mary)

;;  We  show  that  Mary’s  age  is  actually  modified. 

[person] : 

salary  ->  [number] : 50000 
age  ->  [number]: 21 
address  -> 

[location] : 

number -> [string] : "47"
street -> [string] : "West"
city -> [string] : "New York"
state -> [string] : "NY"
name -> [string] : "Mary"


ADL[2]>  (modify  mary  address  (address  john)) 

;;  Mary’s  address  becomes  the  same  as  John’s  address. 
;;  The  object  is  shared. 

ADL[2]>  (output  person) 

;;  Now,  both  persons  have  the  same  address. 

Instances [person] : : : 

[person] : 

salary  ->  [number] : 50000 
age  ->  [number]: 21 
address  -> 

[location] : 

number  ->  [string] : "2260" 
street  ->  [string] : "Yale" 
city  ->  [string] : "Palo  Alto" 
state  ->  [string] : "CA" 
name  ->  [string] : "Mary" 

[person] : 

salary  ->  [number] : 40000 
age  ->  [number]: 20 
address  -> 

[location] : 

number  ->  [string] : "2260" 
street  ->  [string] : "Yale" 
city  ->  [string] : "Palo  Alto" 
state  ->  [string] :"CA" 
name  ->  [string] : "John" 

ADL[2]> (modify (address mary) city "Stanford")

;; We change the city of Mary's address to "Stanford".
;; Since the location object is shared, this change is
;; automatically propagated to John's address.

ADL[2]>  (output  person) 

;;  The  change  is  actually  propagated. 

Instances [person] : : : 

[person] : 

salary  ->  [number] : 50000 
age  ->  [number]: 21 
address  -> 

[location] : 

number  ->  [string] : "2260" 
street  ->  [string] : "Yale" 
city  ->  [string] : "Stanford" 
state  ->  [string] : "CA" 
name  ->  [string] : "Mary" 

[person] : 

salary  ->  [number] : 40000 
age  ->  [number] : 20 



address  -> 

[location] : 

number  ->  [string] : "2260" 
street  ->  [string] : "Yale" 
city  ->  [string] : "Stanford" 
state  ->  [string] : "CA" 
name  ->  [string] : "John" 

ADL[2]>  (modify  john  age  300) 

;; This change contradicts the integrity constraint that
;; a person's age should be greater than 0 and less than 200.

ADL[2]> (end-transaction)
transaction[3] aborted

;; The transaction in level 2 is rejected.

;; Since the modifications of the addresses of John and Mary
;; were performed in level 2, they are thrown away.

ADL[1]> (modify john age 30)

;; Just one more change in level 1.

ADL[1]> (end-transaction)
transaction[2] successfully terminated

;; Only the changes performed in level 1 have been accepted.

ADL[0]>  (output  person) 

;; We show what has been changed.

Instances [person] :

[person] :

salary -> [number] : 50000
age -> [number] : 25
address ->

[location] :

number -> [string] : "47"
street -> [string] : "West"
city -> [string] : "New York"
state -> [string] : "NY"
name -> [string] : "Mary"

[person] :

salary -> [number] : 40000
age -> [number] : 30
address ->

[location] :

number -> [string] : "2260"
street -> [string] : "Yale"
city -> [string] : "Palo Alto"
state -> [string] : "CA"
name -> [string] : "John"

;; Only Mary's and John's ages have been changed.

;; Next we demonstrate the automatic object-binding.
ADL[0]> (output student)

Instances [student] :

;; No instances are bound to 'student'.

ADL[0]> (begin-transaction) [5]

ADL[1]> (modify John occupation "student")

;; John becomes a 'student'.

ADL[1]> (end-transaction)
transaction[5] successfully terminated

ADL[0]>  (output  student) 

;;  Now,  John  is  bound  to  ‘student’  as  an  instance. 

Instances  [student] : 

[student] : 

salary  ->  [number] : 40000 
occupation  ->  [string] : "student" 
age  ->  [number]: 30 
address  -> 

[location]  : 

number  ->  [string] : "2260" 
street  ->  [string] : "Yale" 
city  ->  [string] : "Palo  Alto" 
state -> [string] : "CA"
name  ->  [string] : "John" 

ADL[0]>  (begin-transaction) [6] 

ADL[1]> (modify John occupation "professor")

;; John becomes a 'professor'. He is no longer a 'student'.

ADL[1]> (end-transaction)
transaction[6] successfully terminated

ADL[0]> (output student)

Instances [student] :

;; He is no longer bound to 'student'.

ADL[0]> (output professor)

;; Now he has been moved from 'student' to 'professor'.
Instances [professor] :

[professor] : 

salary  ->  [number] : 40000 
occupation -> [string] : "professor"
age -> [number] : 30
address ->

[location] :

number -> [string] : "2260"
street -> [string] : "Yale"
city -> [string] : "Palo Alto"
state -> [string] : "CA"
name -> [string] : "John"

;; The next demonstration shows the integrity constraints involving several
;; C-classes.

ADL[0]> (defconcept I-am-the-richest (base entity) (isa top)
          ((name string) (salary number))
          (res (forall ((x person)) (gt (salary self) (salary x)))))

;; First, we define a new C-class, which claims that
;; it is richer than any 'person'.

ADL[0]> (compile)

;; Incrementally compile the schema.

ADL[0]> (begin-transaction) [7]

ADL[1]> (insert (I-am-the-richest (name "tyrant") (salary 10000)))

ADL[1]> (end-transaction)
transaction[7] aborted

;; Since there is already a 'person' whose 'salary' is
;; more than 10000, the transaction is rejected.

ADL[0]> (begin-transaction) [8]

ADL[1]> (insert (I-am-the-richest (name "tyrant") (salary 100000)))

ADL[1]> (end-transaction)
transaction[8] successfully terminated

;; No 'person' earns more than 100000. So, this transaction
;; is accepted.

ADL[0]> (begin-transaction) [9]

;; Now, we try to insert a 'person' whose salary is
;; more than that of "tyrant".

ADL[1]> (insert (person (name "richman") (age 45) (salary 110000)))

ADL[1]> (end-transaction)
transaction[9] aborted

;; Although "richman" satisfies the local constraint on
;; the age, this transaction is rejected, because "tyrant"
;; doesn't allow a richer 'person' than him.

;; We can use any first order formula to express the integrity constraints.

;; The following example demonstrates the use of quantified first order formulas.
;; Since the schema objects can be incrementally defined, we can express
;; a complicated query by a schema definition.

ADL[0]> (defconcept oldest-person (derived entity) (isa person) nil
          (res (forall ((x person)) (ge (age self) (age x)))))

ADL[0]> (defconcept the-oldest-person (derived entity) (isa person) nil
          (res (forall ((x person))
                 (if (not (equal self x)) (gt (age self) (age x))))))

;;  Two  classes  are  added.  The  class  'the-oldest-person' 

;;  should  be  a  person  who  is  really  older  than  any  one  else. 

ADL[0]>  (compile) 

ADL[0]>  (output  oldest-person) 

;; Both 'oldest-person' and 'the-oldest-person' have an
;; instance, because there is only one person with the
;; oldest age.

Instances [oldest-person] : : 

[oldest-person] : 

salary  ->  [number] : 40000 
occupation -> [string] : "professor"
age -> [number] : 30
address ->

[location] :

number -> [string] : "2260"
street -> [string] : "Yale"
city -> [string] : "Palo Alto"
state -> [string] : "CA"
name -> [string] : "John"

ADL[0]>  (output  the-oldest-person) 

Instances [the-oldest-person] : : 

[the-oldest-person] : 
salary  ->  [number] : 40000 
occupation -> [string] : "professor"
age  ->  [number]: 30 
address  -> 

[location] : 

number  ->  [string] : "2260" 
street  ->  [string] : "Yale" 
city  ->  [string] : "Palo  Alto" 
state  ->  [string] : "CA" 
name  ->  [string] : "John" 

ADL[0]>  (begin-transaction) [10] 

;;  Now,  we  add  one  more  'person'  whose  age  is  the  oldest. 

ADL[1]>  (insert  (person  (name  "Kate")  (age  30)  (salary  45000))) 

ADL[1]>  (end-transaction) 

transaction[10]  successfully  terminated 

;; Now, there are two persons with the oldest age 30.

ADL[0]> (output oldest-person)

;; So, 'oldest-person' has two instances.

Instances [oldest-person] :

[oldest-person] :

salary -> [number] : 40000
occupation -> [string] : "professor"
age  ->  [number]: 30 
address  -> 

[location] : 

number  ->  [string] : "2260" 
street  ->  [string] : "Yale" 
city  ->  [string] : "Palo  Alto" 
state  ->  [string] : "CA" 
name  ->  [string] : "John" 

[oldest-person] : 
salary  ->  [number] : 45000 
age  ->  [number] : 30 
name  ->  [string] : "Kate" 

ADL[0]> (output the-oldest-person)

Instances [the-oldest-person] :

;; But 'the-oldest-person' has no instances, because
;; there is no person who is strictly older than anyone else.

ADL[0]>